2008-02-26

Summary of the School of Information's 2007 Academic Talks

Posted in Lab at 0:16. Author: 仲远

Title: Towards Unified Theory of Uncertainty

Speaker: Prof. Dan Ralescu

Time: March 29, 2007, 15:00

Abstract: As part of mixed models of uncertainty, we will study random sets and their applications, as well as set-valued probabilities. As an extension of these results, we will next introduce the concept of a fuzzy random variable (FRV), study its properties, and discuss some applications to statistical inference with fuzzy data. The law of large numbers and the central limit theorem for FRVs will also be explored. Then we will study the concept of fuzzy probability and its properties, and prove a law of large numbers with respect to a fuzzy probability. Finally, we investigate the aggregation of fuzzy concepts, using both the fuzzy integral and the Choquet integral.
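The abstract names the Choquet integral as an aggregation tool without defining it. As a rough illustration only, here is a minimal sketch of the discrete Choquet integral over a finite set of criteria. The function name, the example scores, and the capacity values are all made up for this sketch and do not come from the talk.

```python
def choquet_integral(values, capacity):
    """Discrete Choquet integral.

    values:   dict mapping each criterion to a non-negative score
    capacity: dict mapping frozensets of criteria to a measure in [0, 1]
              (hypothetical numbers; it need not be additive)
    """
    # Sort criteria by ascending score.
    items = sorted(values, key=values.get)
    total, previous = 0.0, 0.0
    for i, item in enumerate(items):
        # Set of criteria whose score is at least the current one.
        upper_set = frozenset(items[i:])
        total += (values[item] - previous) * capacity[upper_set]
        previous = values[item]
    return total


# Illustrative data: a non-additive capacity on two criteria.
scores = {"a": 0.2, "b": 0.6}
capacity = {
    frozenset(): 0.0,
    frozenset({"a"}): 0.3,
    frozenset({"b"}): 0.5,
    frozenset({"a", "b"}): 1.0,
}
result = choquet_integral(scores, capacity)
```

With an additive capacity this reduces to an ordinary weighted mean; the non-additive case is what lets the aggregation model interaction between criteria.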

 

Title: Introduction to holomorphic dynamics

Time: May 28, 2007, 15:30

Abstract: Holomorphic dynamics is an active and fertile area of research with close connections to many other disciplines. I shall give an introduction to this topic.

 

Title: The Research Issues for Advanced Technologies and Applications in Data Mining

Speaker: Professor Jiawei Han (UIUC)

Time: May 30, 10:30

Abstract: Research in data mining has two general directions: theoretical foundations and advanced technologies and applications. In this talk, we will focus on the research issues for advanced technologies and applications in data mining and discuss some recent progress in this direction, including (1) pattern mining, usage, and understanding, (2) information network analysis, (3) stream data mining, (4) mining moving object data, RFID data, and data from sensor networks, (5) spatiotemporal and multimedia data mining, (6) biological data mining, (7) text and Web mining, (8) data mining for software engineering and computer system analysis, and (9) data cube-oriented multidimensional online analytical analysis.

 

Title: Nature of Invention in Computer Science

Speaker: Dennis Shasha

Time: June 8, 2007

 

Title: MonetDB: A next-generation database kernel for query-intensive applications

Speaker: Stefan (CWI)

Time: June 11, 2007

 

Title: High-Performance Database Technology for Hierarchical Memory Systems

Speaker: Sandor Heman (CWI)

Time: June 11, 2007

 

Title: MonetDB/XQuery: A Fast XQuery Processor Powered by a Relational Engine

Speaker: Stratos Idreos (CWI)

Time: June 11, 2007

 

Title: [not recoverable]

Speaker: Anastassia Ailamaki, Carnegie Mellon University

Time: June 14, 2007

 

Title: Automatic Annotation of Structured Data from the Deep Web

Speaker: Prof. Weiyi Meng, Binghamton

Time: June 25, 10:30-11:30

Abstract: An increasing number of databases in the deep Web have become Web accessible through HTML form-based search interfaces. The data units returned from the underlying database are usually encoded into the result pages dynamically for human browsing. For the encoded data units to be machine processable, which is essential for many applications such as deep Web data collection and comparison shopping, they need to be extracted and assigned meaningful labels. In this talk, I will present a multi-annotator approach for automatic data unit annotation. This approach consists of two steps. It first aligns the data units into different groups such that the data in the same group have the same semantics. Then, for each group, we annotate it from different aspects and aggregate the different annotations to predict a final annotation label. Experimental results indicate that the proposed solutions are highly effective.

 

Title: Mining salient patterns in high dimensional data

Speaker: A/Prof. Wei Wang, University of North Carolina

Time: June 25, 2:30-3:30

Abstract: Advances in new technologies have made data collection easier and faster, resulting in large and complex datasets consisting of hundreds of thousands of objects with hundreds of dimensions. Scalable and efficient unsupervised clustering methods have been the most popular approaches for analyzing these large datasets. Traditional clustering approaches typically partition objects into disjoint groups based on distances in the full dimensional space. However, more often than not, some dimensions of high dimensional data may be irrelevant to a cluster and can mask the cluster’s existence. This phenomenon, called the curse of dimensionality, prevents salient structures from being discovered by traditional clustering approaches. We developed unsupervised clustering approaches to capture pattern-preserving and noise-tolerant clusters in the subspaces of high dimensional space. The proposed subspace clustering algorithms tackle the curse of dimensionality by localizing the search for clusters in the subspaces of the original high dimensional data. They go beyond the existing distance-based clustering criteria by revealing consistent patterns that can be far apart in distance.
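To make the last point concrete, here is a small sketch of two objects that are far apart in full-space Euclidean distance yet follow exactly the same pattern on a subset of dimensions. The "constant shift" criterion and the helper names are assumptions chosen for illustration; the abstract does not say which pattern definition the speaker's algorithms use.

```python
import math


def euclidean(u, v):
    """Full-space Euclidean distance between two objects."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def shift_spread(u, v, dims):
    """Spread of the per-dimension differences on the chosen dimensions.

    A spread of 0 means v equals u plus a constant offset on those
    dimensions, i.e. the two objects rise and fall together there.
    """
    diffs = [u[d] - v[d] for d in dims]
    return max(diffs) - min(diffs)


# Illustrative data: y is x shifted by 100 on dimensions 0-2,
# while dimension 3 is unrelated noise.
x = [1.0, 5.0, 3.0, 9.0]
y = [101.0, 105.0, 103.0, 2.0]

full_distance = euclidean(x, y)               # large: the objects look unrelated
subspace_spread = shift_spread(x, y, [0, 1, 2])  # zero: same pattern in the subspace
all_dims_spread = shift_spread(x, y, range(4))   # the noisy dimension hides it
```

A distance-based method in the full space would never group x and y, while a search restricted to dimensions 0 to 2 sees a perfect match.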

 

Title: Native XML Support in DB2 9 for z/OS

Speaker: [name not recoverable], IBM

Time: September 10, 2007, 15:00

Abstract: XML databases opened a new era for enterprise data management. In this talk, I will discuss why XML databases will take center stage in data management, along with their technology and business drivers. Then, based on engineering factors for database scalability, I will give an overview of how pureXML in DB2 9 for z/OS is designed, including architecture, storage, indexing, query evaluation and the QuickXScan XPath algorithm, and explain how DB2’s approach differs from other major commercial XML database implementations. I will also point out future directions and some of the challenges and research issues.

 

Title: Network Intelligence

Speaker: Prof. Deyi Li, Tsinghua University

Time: September 14, 2007, 15:00-17:30

Abstract: Networks are the key to representing the complex world around us. Small changes in the topology, affecting only a few of the nodes, can open up hidden doors, allowing new possibilities to emerge. While my talk considers network intelligence, it always stresses one kernel idea, i.e. topology first, mainly concerning self-organization, self-similarity and emergence features. Taking network topology as a novel approach to knowledge representation, we discuss the dynamics of information (or virus, etc.) spreading on the Web, the mining of typical topologies from real information networks at multiple scales, and emergence computation as well.

Brain science has achieved great success in molecule-level and cell-level research; however, there is still a long way to go in understanding the cognitive function of the brain as a whole. How can we understand the non-linear function of a brain? How does the left brain (with priority on logical thinking) cooperate with the right brain (with priority on visual thinking)? How far away is the “von-Neumann-style” computer architecture? Might future computer architectures consist of dual cores, one for logical thinking and the other for visual thinking, which interact with each other all the time? Might future operating systems be developed under the mechanism of “growth by preferential attachment”? I will address all these questions in my talk.

 

Title: [not recoverable]

Time: September 14, 2007, 15:00-17:30

 

Title: [partly unrecoverable; surviving words: Web, IBM]

Speaker: [name not recoverable], IBM

Time: September 14, 2007, 15:00-17:30

Abstract: [largely unrecoverable; surviving words: Web, IBM, DB2 9]

 

Title: Grid security via Behavior conformity from Trusted Computing and virtualization

Speaker: Dr. Wenbo Mao, EMC Corporation

Time: September 15, 2007, 15:30-17:00

Abstract: Ideally a grid is a virtual machine or virtual organization (VO) of unbounded computational and storage capacity built by pooling heterogeneous resources from real organizations (lessors). Currently such grids are only seen in scientific or academic communities. To maximally utilize their resources, commercial enterprises, like resource-abundant financial institutions, should “go for grid” and become lessors. Inadequate grid security currently prevents commercial organizations with under-utilized resources from being lessors. A missing security service is behavior conformity: VO code mustn’t damage the lessor, and conversely, the lessor mustn’t compromise the VO’s proprietary information.

Project Daoli strengthens grid security by adding behavior conformity in three levels of virtualization with software components to be tamper-protected by TCG technologies. At the OS level, the protected component is a highly-privileged hypervisor that intercepts interrupts for memory isolation and persistent storage protection. At the application level, the component is a grid application plus protected data. A third level of virtualization, which is realized by grid middleware, enables one piece of code to run across the VO’s heterogeneous environment; policy enforcement is achieved simply by propagating this code with the protective credential being migrated along the TCG-technology enabled platforms.

 

Title: [not recoverable]

Time: September 15, 2007, 15:30-17:00

 

Title: [not recoverable]

Time: September 15, 2007, 15:30-17:00

 

Title: Core Role-Based Access Control: Efficient Implementations by Transformations

Speaker: Annie Liu, State University of New York at Stony Brook

Time: September 17, 2007, 10:00-11:30

Location: Room 417

Abstract: This talk describes a transformational method applied to the core component of role-based access control (RBAC), to derive efficient implementations from a specification based on the ANSI standard for RBAC. The method is based on the idea of incrementally maintaining the result of expensive set operations, where a new method is described and used for systematically deriving incrementalization rules. We calculate precise complexities for three variants of efficient implementations as well as for a straightforward implementation based on the specification. We describe successful prototypes and experiments for the efficient implementations and for automatically generating efficient implementations from straightforward implementations. We will end with an overview of our other work on generating efficient implementations for constrained RBAC, information flow analysis, and trust management policy analysis.
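The idea of "incrementally maintaining the result of expensive set operations" can be sketched in a few lines. The code below is a hand-written illustration, not the speaker's derived implementation: the class, its method names, and the choice of a per-user grant counter as the maintained result are all assumptions made for this example.

```python
from collections import defaultdict


class CoreRBAC:
    """Toy core RBAC state with one incrementally maintained query."""

    def __init__(self):
        self.user_roles = defaultdict(set)   # user -> roles (user assignment)
        self.role_users = defaultdict(set)   # role -> users (inverse map)
        self.role_perms = defaultdict(set)   # role -> permissions
        # Maintained result: how many of a user's roles grant each permission.
        self.grant_count = defaultdict(lambda: defaultdict(int))

    def grant_permission(self, perm, role):
        if perm in self.role_perms[role]:
            return
        self.role_perms[role].add(perm)
        for user in self.role_users[role]:
            self.grant_count[user][perm] += 1

    def assign_user(self, user, role):
        if role in self.user_roles[user]:
            return
        self.user_roles[user].add(role)
        self.role_users[role].add(user)
        for perm in self.role_perms[role]:
            self.grant_count[user][perm] += 1

    def deassign_user(self, user, role):
        if role not in self.user_roles[user]:
            return
        self.user_roles[user].remove(role)
        self.role_users[role].remove(user)
        for perm in self.role_perms[role]:
            self.grant_count[user][perm] -= 1

    def permissions_recomputed(self, user):
        """Straightforward version: recompute the union on every query."""
        return {p for r in self.user_roles[user] for p in self.role_perms[r]}

    def has_permission(self, user, perm):
        """Incremental version: constant-time lookup on maintained counts."""
        return self.grant_count[user][perm] > 0


rbac = CoreRBAC()
rbac.grant_permission("read", "viewer")
rbac.grant_permission("read", "editor")
rbac.grant_permission("write", "editor")
rbac.assign_user("ann", "viewer")
rbac.assign_user("ann", "editor")
rbac.deassign_user("ann", "editor")
```

The trade-off is the usual one: each update does a little extra work so that the frequent query no longer has to recompute a set union from scratch.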

 

Title: Web Data Management Research at Microsoft Research Asia

Time: September 27, 2007

 

Title: Object-level Web Search

Time: September 27, 2007

 

Title: Relevance Ranking in Information Retrieval

Time: September 27, 2007

 

Title: [partly unrecoverable; surviving word: SOA]

Speaker: [name not recoverable], IBM

Time: October 31, 2007, 14:00-17:00

Abstract: [largely unrecoverable; surviving word: SOA]

 

Title: Behavior-Based Detection of Internet Worms

Speaker: Dr. Jun Li

Time: December 5, 10:00

Location: Room 414

Abstract: Self-propagating worms pose a significant threat to the Internet. Since their first major appearance in 1988, these malicious programs have been exploiting software vulnerabilities to gain control of unwitting hosts with increasing sophistication. However, accurately detecting Internet worms in their early stages remains an unsolved problem.

In this talk, the speaker will first survey the trend of Internet worms and describe the state-of-the-art worm detection mechanisms.  He will show the limitations of content-based approaches, which scan Internet traffic for worm signatures or suspicious byte patterns, and host-based approaches, which rely on the installation of new mechanisms on individual machines. He will then introduce a new worm detection paradigm based on inspecting behaviors of self-propagating worms.  In particular, he will describe his SWORD project that checks the traffic crossing the border of a network in order to look for the self-similarity, destination pattern, and continuity of worm connections.

 

Title: [not recoverable]

Speaker: [name not recoverable] (surviving affiliation fragment: Bedford)

Time: December 13, 2007, 9:30-11:00

Abstract: [largely unrecoverable. Surviving fragments: the numbers 1, 33 and 297; the expression 33 × 11; the year 1977 with the names Rivest, Shamir, Adleman; the year 2003; and the reference: Primality Testing and Integer Factorization in Public-Key Cryptography, 2nd Edition, Springer, 2008.]

 

Title: Security and Integrity in Outsourcing of Data Mining

Speaker: Prof. David Wai-lok Cheung, The University of Hong Kong

Time: December 17, 2007, 8:40-9:10

Abstract: Outsourcing of data mining to an outside service provider brings important benefits to the data owner. These include (i) relief from the high mining cost, (ii) minimization of demands on resources, and (iii) effective centralized mining for multiple distributed owners. However, security and integrity are issues that must be tackled before enterprises can indeed outsource data mining tasks. The service provider should be prevented from accessing the actual data (security), and the results returned to the owner must be authentic (integrity).

In this talk, we will present the result of a secure association rules mining algorithm that we published recently to explain the outsourcing model and to illustrate the feasibility of the approach we used. To protect security in mining association rules, a substitution cipher technique has been proposed for the encryption of transactional data. After identifying the non-trivial threats to a straightforward one-to-one item mapping substitution cipher, we propose a novel secure encryption algorithm based on a one-to-n item mapping that transforms transactions non-deterministically, yet guarantees correct decryption. We will also discuss the integrity problem in the same outsourcing model for association mining.
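The contrast between a one-to-one and a one-to-n substitution can be shown with a toy example. This is only a sketch of the general idea (each plaintext item owns several ciphertext codes, one of which is picked at random per occurrence); it is not the published algorithm, which the abstract does not spell out, and every name below is hypothetical.

```python
import random


def build_mapping(items, n, rng):
    """Give every plaintext item n distinct ciphertext codes.

    Returns (encrypt_map, decrypt_map). Codes are disjoint across items,
    which is what makes decryption unambiguous.
    """
    codes = list(range(len(items) * n))
    rng.shuffle(codes)
    encrypt_map = {
        item: codes[i * n:(i + 1) * n] for i, item in enumerate(items)
    }
    decrypt_map = {
        code: item for item, owned in encrypt_map.items() for code in owned
    }
    return encrypt_map, decrypt_map


def encrypt(transaction, encrypt_map, rng):
    """Replace each item by one of its codes, chosen at random."""
    return {rng.choice(encrypt_map[item]) for item in transaction}


def decrypt(cipher_transaction, decrypt_map):
    """Map every code back to the single item that owns it."""
    return {decrypt_map[code] for code in cipher_transaction}


rng = random.Random(2007)  # fixed seed so the sketch is reproducible
enc_map, dec_map = build_mapping(["bread", "milk", "beer"], n=3, rng=rng)
basket = {"bread", "milk"}
cipher = encrypt(basket, enc_map, rng)
recovered = decrypt(cipher, dec_map)
```

Because the same basket can encrypt to different code sets on different runs, simple frequency counting on ciphertext codes no longer lines up one-to-one with item frequencies, while the owner can still decode exactly.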

 

Title: Real-time Near-Duplicate Video Clip Retrieval

Speaker: Dr. HengTao Shen, University of Queensland

Time: December 17, 2007, 9:10-9:40

Abstract: Near-duplicate video clip (NDVC) retrieval is an important problem with a wide range of applications such as TV broadcast monitoring, video copyright enforcement, content-based video clustering and annotation, etc. For a large database with tens of thousands of video clips, each with thousands of frames, can NDVC search be performed in real time? In this talk, we discuss NDVC retrieval, including its background, concepts, applications, and challenges. Our key idea is to use a highly compact video summarization derived from all video frames. To be effective, the summarization must capture the original video content accurately, and to be efficient, the summarization should be small so that indexing structures can be deployed for fast search. We propose a novel technique that summarizes a video clip into a single, small representative which exploits the frame correlation and captures the dominating content and changing trends. By introducing a prototype system named UQLIPS, we demonstrate that such a summarization can be a practical solution for real-time NDVC retrieval from very large video databases.

 

Title: Whose Opinion Should You Trust Among a Group of Stock Market Analysts?

Speaker: Prof. Kam-Fai Wong, The Chinese University of Hong Kong

Time: December 17, 2007, 10:10-10:40

Abstract: Stock market reports help users make investment decisions. Analysts provide opinions about the trend of the market for the next trading day based on today’s information. However, users have to mine the opinions of the analysts from a large amount of textual information in a short time. This information often contains contrasting opinions, which makes decisions difficult for users. In this work, we investigate learning-based and pattern-based approaches to mine analysts’ opinions automatically. We also investigate different opinion incorporation strategies for extracting the most reliable opinion. In the learning-based approach, n-gram features and carefully designed market features are employed. In the pattern-based approaches, certain fixed phrase structures are extracted to identify opinions. Experiments show that pattern-based approaches achieve better results. The average precision score of analysts’ predictions is about 60%. The best incorporation strategy is to believe the best analyst.

 

Title: Out-of-Core Coherent Closed Quasi-Clique Mining from Large Dense Graph Databases

Speaker: Dr. Jianyong Wang, Tsinghua University

Time: December 17, 2007, 10:40-11:10

Abstract: Due to the ability of graphs to represent more generic and more complicated relationships among different objects, graph mining has played a significant role in data mining, attracting increasing attention in the data mining community. In addition, frequent coherent subgraphs can provide valuable knowledge about the underlying internal structure of a graph database, and mining frequently occurring coherent subgraphs from large dense graph databases has found several applications and received considerable attention in the graph mining community recently. In this talk, we introduce how to efficiently mine the complete set of coherent closed quasi-cliques from large dense graph databases, which is an especially challenging task because the downward-closure property no longer holds. By fully exploring some properties of quasi-cliques, we propose several novel optimization techniques which can prune the unpromising and redundant subsearch spaces effectively. Meanwhile, we devise an efficient closure checking scheme to facilitate the discovery of closed quasi-cliques only. Since large databases cannot be held in main memory, we also design an out-of-core solution with efficient index structures for mining coherent closed quasi-cliques from large dense graph databases. We call this Cocain*. A thorough performance study shows that Cocain* is very efficient and scalable for large dense graph databases.
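Why downward closure fails for quasi-cliques is easy to see on a tiny graph. The sketch below assumes a common degree-based definition (every vertex of the set is adjacent to at least ⌈γ·(|S|−1)⌉ other vertices of the set); the abstract does not state the exact definition used in the talk, and the connectivity requirement is left out here for brevity.

```python
import math


def is_quasi_clique(adjacency, vertices, gamma):
    """Check the degree condition of a gamma-quasi-clique.

    adjacency: dict vertex -> set of neighbouring vertices
    vertices:  candidate vertex set S
    gamma:     required fraction of the other |S| - 1 vertices
               each member must be adjacent to (assumed definition)
    """
    vertices = set(vertices)
    required = math.ceil(gamma * (len(vertices) - 1))
    return all(len(adjacency[v] & vertices) >= required for v in vertices)


# A cycle on five vertices: 0-1-2-3-4-0.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}

whole = is_quasi_clique(cycle, range(5), 0.5)       # every degree is 2 >= ceil(0.5 * 4)
subset = is_quasi_clique(cycle, [0, 1, 2, 3], 0.5)  # vertex 0 keeps only neighbour 1
```

The five-cycle satisfies the condition for γ = 0.5, but dropping one vertex leaves a path whose endpoints fall below the threshold. A miner therefore cannot discard a candidate just because one of its sub-patterns failed, which is the pruning trick Apriori-style algorithms rely on.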

 

Title: A semantic approach for content search and content extraction in XML query processing

Speaker: Prof. Tok Wang Ling, National University of Singapore

Time: December 17, 2007, 11:10-11:40

 

Title: Data Quality: A Database Perspective

Speaker: Prof. Xiaofang Zhou, University of Queensland

Time: December 17, 2007, 2:00-2:30

Abstract: There is a clear and strong trend of rising interest in the Data Quality (DQ) problem from the database research community as evidenced by data quality related papers in prominent conferences and journals. The industry response to this issue has already resulted in dedicated tools and technologies to provide reliable data quality control methods. In this talk we will present an overview of the key issues and solutions. It will cover general aspects of the problem space as well as the most significant research achievements in key topics, focusing on data cleaning, linking, and provenance research by the database research community.

 

Title: Metadata Web: An Infrastructure for Enterprise Data Integration

Speaker: Dr. Yue Pan, IBM China Research Lab

Time: December 17, 2007, 2:30-3:00

Abstract: Efficient metadata management is recognized as a key to enterprise data integration. How can multiplying heterogeneous metadata repositories in the enterprise be managed? How can the dynamic categorization of, and relationships between, customers, products and services be exploited? Can the cost of business integration be vastly reduced? Can an enterprise get a dynamic integration platform without replacing its infrastructure? We propose an approach, Metadata Web, which can evolve by itself to address the above issues. This showcase will introduce the concept of Metadata Web and demonstrate how existing metadata repositories can be transformed with a mixture of open source and IBM technologies into “federated” ones. It will also show how impact analysis and master data analytics lead to new insight from the semantically enriched metadata. We will also give an update on the state-of-the-art enabling technologies: Semantic Object Repository and Semantic Search. Finally, it concludes with a vision of the future of semantic integration.

 

Title: Effective and Efficient Semantic Web Data Management

Speaker: Dr. Li Ma, IBM China Research Lab

Time: December 17, 2007, 3:00-3:30

Abstract: With the fast growth of the Semantic Web, more and more RDF data and ontologies are created and widely used in Web applications and enterprise information systems. It is reported that the data set of the W3C’s LinkingOpenData project consists of over two billion RDF triples, which are interlinked by about three million RDF links. Recently, efficient RDF data management on top of relational databases has gained particular attention from both the Semantic Web community and the database community. In this presentation, I will discuss major challenges and problems in Semantic Web data management and address these issues by introducing a scalable Semantic Web data management system (SOR) over DB2, including efficient schema and index design for storage, practical ontology reasoning support, and an effective SPARQL-to-SQL translation method for RDF queries. In particular, I will present a scalable and highly expressive reasoning approach to Semantic Web data, which can process RDF queries over hundreds of millions of RDF triples efficiently, whereas state-of-the-art reasoners could not scale to this size. Finally, I will show the performance and scalability of well-known RDF stores, and emphasize remaining challenges and future work on Semantic Web data management.
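For readers who have not seen RDF stored relationally, here is a minimal sketch of the general technique: triples in one table, and a two-pattern SPARQL query hand-translated into a SQL self-join. It uses SQLite from the Python standard library rather than DB2, and the single `triples` table, the index, and the `ex:` data are assumptions for illustration; the abstract does not describe SOR's actual schema or translation rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: one row per RDF triple (subject, predicate, object).
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
# An index led by predicate and object helps patterns with a fixed p and o.
conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "rdf:type", "ex:Person"),
        ("ex:alice", "ex:worksFor", "ex:acme"),
        ("ex:bob", "rdf:type", "ex:Person"),
        ("ex:acme", "rdf:type", "ex:Company"),
    ],
)

# SPARQL being translated (shown as a comment, not executed):
#   SELECT ?x WHERE { ?x rdf:type ex:Person . ?x ex:worksFor ex:acme }
# Each triple pattern becomes one alias of the table; the shared
# variable ?x becomes an equality join on the subject column.
sql = """
SELECT t1.s
FROM triples AS t1
JOIN triples AS t2 ON t1.s = t2.s
WHERE t1.p = 'rdf:type' AND t1.o = 'ex:Person'
  AND t2.p = 'ex:worksFor' AND t2.o = 'ex:acme'
"""
rows = conn.execute(sql).fetchall()
```

Every extra triple pattern adds another self-join, which is why schema and index design matter so much once the table holds hundreds of millions of rows.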

 

Title: Towards a Chinese Web Infrastructure: Challenges and Our Roadmap

Speaker: Dr. Weining Qian, East China Normal University

Time: December 17, 2007, 3:30-4:00

Abstract: With the development of the World-Wide Web, the storage and utilization of web data have become a big challenge for the data management community. Though many commercial and academic tools have emerged, the structure, content, and user behavior of the Chinese Web have not been fully studied. We are working on building a Chinese Web Infrastructure to support such research. In this talk, the challenges of building such a system are analyzed, and our technical roadmap is discussed.

 

Title: XML Data Management Using Semantics in ORA-SS

Speaker: Ling Tok Wang, National University of Singapore

Time: December 18, 2007, 2:00-3:30

Abstract: Traditional semantic data models, such as the Entity Relationship (ER) data model, are used to represent real world semantics that are crucial for the effective management of structured data.

Today, semistructured data has become more prevalent on the Web, and XML has become the de facto standard for semistructured data. A DTD and an XML Schema of an XML document only reflect the hierarchical structure of the semistructured data stored in the XML document. The hierarchical structures of XML documents are captured by the relationships between an element and its attributes, and between an element and its subelements. Element-attribute relationships do not have clear semantics, and the relationships between elements and their subelements are binary. The semantics of n-ary relationships with n > 2 cannot be represented or captured correctly and precisely in DTD and XML Schema. Many of the crucial semantics captured by the ER model for structured data are not captured by either DTD or XML Schema. We solve these problems by using a semantic-rich data model called the Object, Relationship, Attribute data model for SemiStructured Data (ORA-SS).

In this talk, I will discuss how to use the semantics captured in the ORA-SS for XML data management, such as XML database design, object-relational storage for XML data, XML view creation and validation, XML graphical query language and output, XML query optimization, XML keyword search, etc.

 

Title: A Semantic Approach for Content Search and Content Extraction in XML Query Processing

Speaker: Ling Tok Wang, National University of Singapore

Time: December 19, 2007, 2:00-3:30

Abstract: Processing a twig pattern query in an XML document includes structural search and content search. Most existing algorithms only focus on structural search. They treat content nodes the same as element nodes during query processing with structural joins. Due to the high variety of contents, mixing content search and structural search suffers from content management problems and low performance. Another disadvantage is that, to find the actual values requested by a query, they have to rely on the original document. In this talk, we propose a novel algorithm, Value Extraction with Relational Table (VERT), to overcome these limitations. The main technique of VERT is using relational tables to store document contents instead of treating them as nodes and labeling them. Tables in our algorithm are created based on semantic information of the documents. As more semantics is captured, we can further optimize the tables and enhance twig pattern query processing. We show by experiments that, besides solving different content problems, VERT also outperforms existing algorithms in twig pattern query processing.

 

Title: Effective Semantics for Ranked Keyword Search in XML Database

Speaker: Ling Tok Wang, National University of Singapore

Time: December 20, 2007, 2:00-3:30

Abstract: Keyword search is a user-friendly way to query XML databases. Most previous efforts in this area focus on keyword proximity search in XML based on either a tree data model or a graph (or digraph) data model. The tree data model for XML is generally simple and efficient for keyword proximity search. However, it cannot capture connections such as ID references in XML databases. In contrast, techniques based on a graph (or digraph) data model capture connections, but are generally inefficient to compute. In this paper, we propose an interconnected object trees model for keyword search to achieve the efficiency of the tree model while capturing connections such as ID references in XML by fully exploiting the property and schema information of XML databases. In particular, we propose ICA (Interested Common Ancestor) semantics to find all predefined interested objects that contain all query keywords. We also introduce novel IRA (Interested Related Ancestors) semantics to capture the conceptual connections between interested objects and include more objects that only contain some query keywords. Then, a novel ranking metric, RelevanceRank, is studied to dynamically assign higher ranks to objects that are more relevant to a given keyword query according to the conceptual connections in IRAs. Experimental results show that our approach outperforms most existing systems in terms of result quality. We have implemented a prototype of our ICRA system (ICRA = ICA + IRA) on the DBLP data.

This article may be freely reposted; when reposting, please keep the full text and credit the source:
Reposted from 仲子说 [ http://www.wangzhongyuan.com/ ]
Original link:

3 Comments »

  1. 过客 said,

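    February 26, 2008 at 5:35

    Hello, brother... A friend of mine, who is like a younger brother to me, really wants a Renmin University school badge...

    But there is nowhere to get one... Could you help with this? If you can, please contact me on QQ: 10418057

    Thank you!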

  2. 仲远 said,

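    February 26, 2008 at 13:14

    If you search my site for “校徽” (school emblem), you will find results.
    You can also click the following link:
    http://www.wangzhongyuan.com/archives/189.html
    to download the Renmin University of China emblem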

  3. MiloX said,

    June 28, 2017 at 23:30

    I see your page needs some fresh content. Writing manually takes a lot of time,
    but there is tool for this time consuming task,
    search in google; murgrabia’s tools
