Posted in Lab at 0:16 Author: 仲远


Title: Towards Unified Theory of Uncertainty

Speaker: Prof. Dan Ralescu, Department of Mathematics, University of Cincinnati, USA



Abstract: As part of mixed models of uncertainty, we will study random sets and their applications, as well as set-valued probabilities. As an extension of these results, we will next introduce the concept of a fuzzy random variable (FRV), study its properties, and discuss some applications to statistical inference with fuzzy data. The law of large numbers and the central limit theorem for FRVs will also be explored. Then we will study the concept of fuzzy probability and its properties, and prove a law of large numbers with respect to a fuzzy probability. Finally, we investigate the aggregation of fuzzy concepts using both the fuzzy integral and the Choquet integral.
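
To make the aggregation step concrete, here is a minimal sketch of the two operators over a finite set of criteria. It assumes that "the fuzzy integral" means the Sugeno integral, which the abstract does not state. The criteria names `a` and `b`, the scores, and the fuzzy measure `mu` are invented for illustration.

```python
def _upper_sets(values):
    """Yield (score, set of criteria scoring at least that much), ascending."""
    order = sorted(values, key=values.get)
    for i, name in enumerate(order):
        yield values[name], frozenset(order[i:])


def choquet_integral(values, mu):
    """Discrete Choquet integral of `values` w.r.t. the fuzzy measure `mu`."""
    total, previous = 0.0, 0.0
    for x, upper in _upper_sets(values):
        total += (x - previous) * mu[upper]
        previous = x
    return total


def sugeno_integral(values, mu):
    """Discrete Sugeno integral; scores and measure are assumed to lie in [0, 1]."""
    return max(min(x, mu[upper]) for x, upper in _upper_sets(values))


# Hypothetical fuzzy measure on two criteria: monotone, but not additive.
mu = {
    frozenset(): 0.0,
    frozenset({"a"}): 0.3,
    frozenset({"b"}): 0.5,
    frozenset({"a", "b"}): 1.0,
}
scores = {"a": 0.8, "b": 0.4}

print(choquet_integral(scores, mu))
print(sugeno_integral(scores, mu))
```

With these made-up numbers the Choquet integral is 0.4 × 1.0 + (0.8 - 0.4) × 0.3 = 0.52, while the Sugeno integral is max(min(0.4, 1.0), min(0.8, 0.3)) = 0.4.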


Title: Introduction to holomorphic dynamics

Speaker: 王跃飞, Vice President of the Academy of Mathematics and Systems Science, Chinese Academy of Sciences



Abstract: Holomorphic dynamics is an active and fertile area of research with close connections to many other disciplines. I shall give an introduction to this topic.


Title: The Research Issues for Advanced Technologies and Applications in Data Mining

Speaker: Professor Jiawei Han (UIUC, USA)



Abstract: Research in data mining has two general directions: theoretical foundations, and advanced technologies and applications. In this talk, we will focus on the research issues for advanced technologies and applications in data mining and discuss some recent progress in this direction, including (1) pattern mining, usage, and understanding, (2) information network analysis, (3) stream data mining, (4) mining moving object data, RFID data, and data from sensor networks, (5) spatiotemporal and multimedia data mining, (6) biological data mining, (7) text and Web mining, (8) data mining for software engineering and computer system analysis, and (9) data cube-oriented multidimensional online analytical analysis.


Title: Nature of Invention in Computer Science

Speaker: Prof. Dennis Shasha, Chair of the Department of Computer Science, New York University, USA

Time: June 8, 2007, morning



Title: MonetDB: A next-generation database kernel for query-intensive applications





Title: High-Performance Database Technology for Hierarchical Memory Systems

Speaker: Dr. Sandor Heman, National Research Institute for Mathematics and Computer Science (CWI), the Netherlands




Title: MonetDB/XQuery: A Fast XQuery Processor Powered by a Relational Engine

Speaker: Dr. Stratos Idreos, National Research Institute for Mathematics and Computer Science (CWI), the Netherlands





Speaker: Assoc. Prof. Anastassia Ailamaki, Carnegie Mellon University, USA




Title: Automatic Annotation of Structured Data from the Deep Web

Speaker: Prof. Weiyi Meng, State University of New York at Binghamton, USA



Abstract: An increasing number of databases in the deep Web have become Web accessible through HTML form-based search interfaces. The data units returned from the underlying database are usually encoded into the result pages dynamically for human browsing. For the encoded data units to be machine processable, which is essential for many applications such as deep Web data collection and comparison shopping, they need to be extracted and assigned meaningful labels. In this talk, I will present a multi-annotator approach for automatic data unit annotation. This approach consists of two steps. It first aligns the data units into different groups such that the data in the same group have the same semantics. Then, for each group, we annotate it from different aspects and aggregate the different annotations to predict a final annotation label. Experimental results indicate that the proposed solutions are highly effective.


Title: Mining salient patterns in high dimensional data

Speaker: A/Prof. Wei Wang, University of North Carolina



Abstract: Advances in new technologies have made data collection easier and faster, resulting in large and complex datasets consisting of hundreds of thousands of objects with hundreds of dimensions. Scalable and efficient unsupervised clustering methods have been the most popular approaches for analyzing these large datasets. Traditional clustering approaches typically partition objects into disjoint groups based on distances in the full dimensional space. However, more often than not, some dimensions of high dimensional data may be irrelevant to a cluster and can mask the cluster’s existence. This phenomenon, called the curse of dimensionality, prevents salient structures from being discovered by traditional clustering approaches. We developed unsupervised clustering approaches to capture pattern-preserving and noise-tolerant clusters in the subspaces of high dimensional space. The proposed subspace clustering algorithms tackle the curse of dimensionality by localizing the search for clusters in the subspaces of the original high dimensional data. They go beyond the existing distance-based clustering criteria by revealing consistent patterns that can be far apart in distance.
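
A small numeric illustration of that last point, not the algorithms from the talk: two objects can be far apart in Euclidean distance and still follow exactly the same pattern on a subset of dimensions. The vectors `a` and `b` and the `shift_spread` helper are hypothetical.

```python
from math import dist

# Two made-up objects in a 4-dimensional space.
a = [1.0, 5.0, 2.0, 40.0]
b = [101.0, 105.0, 102.0, -7.0]


def shift_spread(x, y, dims):
    """Max minus min of (y - x) over `dims`.

    A value of 0 means y equals x plus a constant on those dimensions,
    i.e. both objects show the same rise-and-fall pattern there.
    """
    diffs = [y[d] - x[d] for d in dims]
    return max(diffs) - min(diffs)


full_distance = dist(a, b)                    # large: the objects are far apart
in_subspace = shift_spread(a, b, [0, 1, 2])   # 0.0: identical pattern
in_full_space = shift_spread(a, b, range(4))  # 147.0: dimension 3 hides it

print(full_distance, in_subspace, in_full_space)
```

A distance-based method would never group `a` with `b`, while a search restricted to dimensions 0 to 2 sees them as a perfect match. The fourth dimension plays the role of an irrelevant dimension that masks the cluster.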


Title: Native XML Support in DB2 9 for z/OS


Time: September 10, 2007 (Monday), 15:00


Abstract: XML databases opened a new era for enterprise data management. In this talk, I will discuss why XML databases will take center stage in data management, and the technology and business drivers behind them. Then, based on engineering factors for database scalability, I will give an overview of how pureXML in DB2 9 for z/OS is designed, including architecture, storage, indexing, query evaluation, and the QuickXScan XPath algorithm, and how DB2’s approach differs from other major commercial XML database implementations. I will also point out future directions and some of the challenges and research issues.


Title: Network Intelligence

Speaker: Prof. Deyi Li, Tsinghua University

Time: September 14, 2007 (Friday), 15:00-17:30


Abstract: Networks are the key to representing the complex world around us. Small changes in the topology, affecting only a few of the nodes, can open up hidden doors, allowing new possibilities to emerge. My talk on network intelligence centers on one kernel idea, topology first, and mainly concerns self-organization, self-similarity, and emergence. Taking network topology as a novel approach to knowledge representation, we discuss the dynamics of information (or viruses, etc.) spreading on the Web, mining typical topologies from real information networks at multiple scales, and emergent computation.

Brain science has achieved great success in molecule-level and cell-level research; however, there is still a long way to go in understanding the cognitive function of the brain as a whole. How can we understand the non-linear function of a brain? How does the left brain (with its priority on logical thinking) cooperate with the right brain (with its priority on visual thinking)? How far away from this is the “von-Neumann-style” computer architecture? Might future computer architectures consist of two cores, one for logical thinking and the other for visual thinking, which interact with each other all the time? Might future operating systems be developed under the mechanism of “growth by preferential attachment”? I will explore all these questions in my talk.



Speaker: 诸葛海, Research Professor, Institute of Computing Technology, Chinese Academy of Sciences

Time: September 14, 2007 (Friday), 15:00-17:30






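Time: September 14, 2007 (Friday), 15:00-17:30


Abstract: This talk presents IBM's understanding of the challenges facing information management in the Web era, as well as IBM's product and conceptual innovations in core database technology, information management architecture, and information management thinking. It focuses in particular on the technical changes brought by the new-generation database DB2 9 and its early adoption in the Chinese market.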


Title: Grid security via Behavior conformity from Trusted Computing and virtualization

Speaker: Dr. Wenbo Mao, EMC Corporation

Time: September 15, 2007 (Saturday), 15:30-17:00


Abstract: Ideally, a grid is a virtual machine or virtual organization (VO) of unbounded computational and storage capacity built by pooling heterogeneous resources from real organizations (lessors). Currently such grids are only seen in scientific or academic communities. To maximally utilize their resources, commercial enterprises, like resource-abundant financial institutions, should “go for grid” and become lessors. Inadequate grid security currently prevents commercial organizations with under-utilized resources from being lessors. A missing security service is behavior conformity: VO code mustn’t damage the lessor, and conversely, the lessor mustn’t compromise the VO’s proprietary information.

Project Daoli strengthens grid security by adding behavior conformity in three levels of virtualization with software components to be tamper-protected by TCG technologies. At the OS level, the protected component is a highly-privileged hypervisor that intercepts interrupts for memory isolation and persistent storage protection. At the application level, the component is a grid application plus protected data. A third level of virtualization, which is realized by grid middleware, enables one piece of code to run across the VO’s heterogeneous environment; policy enforcement is achieved simply by propagating this code with the protective credential being migrated along the TCG-technology enabled platforms.



Speaker: 任锦华, Director of the Informatization Office, International Department of the CPC Central Committee

Time: September 15, 2007 (Saturday), 15:30-17:00





Time: September 15, 2007 (Saturday), 15:30-17:00



Title: Core Role-Based Access Control: Efficient Implementations by Transformations

Speaker: Annie Liu (刘燕虹), State University of New York at Stony Brook

Time: September 17, 2007 (Monday), 10:00-11:30 AM


Abstract: This talk describes a transformational method applied to the core component of role-based access control (RBAC), to derive efficient implementations from a specification based on the ANSI standard for RBAC. The method is based on the idea of incrementally maintaining the result of expensive set operations, where a new method is described and used for systematically deriving incrementalization rules. We calculate precise complexities for three variants of efficient implementations as well as for a straightforward implementation based on the specification. We describe successful prototypes and experiments for the efficient implementations and for automatically generating efficient implementations from straightforward implementations. We will end with an overview of our other work on generating efficient implementations for constrained RBAC, information flow analysis, and trust management policy analysis.
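
The idea of incrementally maintaining an expensive set operation can be sketched by hand as follows. This is not the systematically derived implementation from the talk: it is a simplified toy that leaves out the sessions of core RBAC, and the class and method names are hypothetical. Instead of joining the user-role and role-permission relations on every access check, it keeps a per-user count of roles granting each permission and updates that count on every change.

```python
from collections import defaultdict


class IncrementalRBAC:
    """Toy RBAC store with an incrementally maintained user -> permission index."""

    def __init__(self):
        self.roles_of = defaultdict(set)   # user -> roles (user assignment)
        self.users_of = defaultdict(set)   # role -> users (inverse index)
        self.perms_of = defaultdict(set)   # role -> permissions
        # count[user][perm] = number of the user's roles that grant perm
        self.count = defaultdict(lambda: defaultdict(int))

    def _bump(self, user, perm, delta):
        c = self.count[user][perm] + delta
        if c:
            self.count[user][perm] = c
        else:
            del self.count[user][perm]     # keep only positive counts

    def assign_user(self, user, role):
        if role in self.roles_of[user]:
            return
        self.roles_of[user].add(role)
        self.users_of[role].add(user)
        for perm in self.perms_of[role]:
            self._bump(user, perm, +1)

    def deassign_user(self, user, role):
        if role not in self.roles_of[user]:
            return
        self.roles_of[user].remove(role)
        self.users_of[role].remove(user)
        for perm in self.perms_of[role]:
            self._bump(user, perm, -1)

    def grant_permission(self, role, perm):
        if perm in self.perms_of[role]:
            return
        self.perms_of[role].add(perm)
        for user in self.users_of[role]:
            self._bump(user, perm, +1)

    def check_access(self, user, perm):
        # Constant-time lookup in the maintained index.
        return self.count[user].get(perm, 0) > 0

    def check_access_naive(self, user, perm):
        # Straightforward version: recompute from the base relations.
        return any(perm in self.perms_of[r] for r in self.roles_of[user])


rbac = IncrementalRBAC()
rbac.grant_permission("editor", "write")
rbac.assign_user("alice", "editor")
print(rbac.check_access("alice", "write"))
```

The trade-off is the usual one for incrementalization: updates do extra work proportional to the size of the affected role, so that the frequent query becomes a single lookup.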


Title: Web Data Management Research at Microsoft Research Asia

Speaker: Dr. 文继荣, Lead Researcher, Microsoft Research Asia

Time: September 27, 2007, morning



Title: Object-level Web Search

Speaker: Dr. 聂再清, Research Lead of the Web Data Management Group, Microsoft Research Asia

Time: September 27, 2007, morning



Title: Relevance Ranking in Information Retrieval

Speaker: Dr. 刘铁岩, Research Lead of the Information Retrieval and Data Mining Group, Microsoft Research Asia

Time: September 27, 2007, morning




Speaker: Prof. 寇卫东, Chief Engineer for the Greater China Region, IBM Software Group

Time: October 31, 2007 (Wednesday), 14:00-17:00




Title: Behavior-Based Detection of Internet Worms

Speaker: Dr. Jun Li (李军), recipient of the U.S. National Science Foundation's award for outstanding young researchers



Abstract: Self-propagating worms pose a significant threat to the Internet. Since their first major appearance in 1988, these malicious programs have been exploiting software vulnerabilities to gain control of unwitting hosts with increasing sophistication. However, accurately detecting Internet worms in their early stages remains an unsolved problem.

In this talk, the speaker will first survey the trend of Internet worms and describe the state-of-the-art worm detection mechanisms. He will show the limitations of content-based approaches, which scan Internet traffic for worm signatures or suspicious byte patterns, and host-based approaches, which rely on the installation of new mechanisms on individual machines. He will then introduce a new worm detection paradigm based on inspecting the behaviors of self-propagating worms. In particular, he will describe his SWORD project, which checks the traffic crossing the border of a network in order to look for the self-similarity, destination pattern, and continuity of worm connections.



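Speaker: 颜松远, Director of the Institute of Computing Research, Bedford University, UK



Abstract: Integer factorization means finding a factor (not necessarily a prime factor) of a positive integer greater than 1; for example, 33 is a factor of 297. Prime factorization means writing a positive integer greater than 1 as a product of prime factors; for example, the prime factorization of 297 is 3^3 × 11. Clearly, integer factorization is the most important operation in prime factorization. Computationally, a fast integer factorization algorithm immediately yields a fast prime factorization algorithm, because prime factorization is just a recursive application of integer factorization. Although people have been searching for a fast integer factorization algorithm for at least two thousand years, none has been found to date. Here, a fast integer factorization algorithm means one that runs in polynomial time. From the standpoint of computation theory, integer factorization is an intractable problem. It was precisely this intractability that Rivest, Shamir, and Adleman of MIT ingeniously exploited in 1977 to design the world's first public-key cryptosystem, which has still not been completely broken and is widely used in network and information security today. For this work, Rivest, Shamir, and Adleman received in 2003 the Turing Award, often described as the Nobel Prize of computer science. This talk introduces the latest methods and progress in integer factorization and how they guide and influence the development of modern cryptography. The material is drawn from the author's latest English-language monograph: Primality Testing and Integer Factorization in Public-Key Cryptography, 2nd Edition, Springer, 2008.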
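
The recursion described in the abstract can be sketched as follows. This is only an illustration: `find_factor` is a hypothetical helper that uses plain trial division, which is far from polynomial time, and stands in for whatever integer factorization method is available.

```python
from math import isqrt
from typing import List, Optional


def find_factor(n: int) -> Optional[int]:
    """Return some nontrivial factor of n, or None if n is prime.

    Hypothetical stand-in for an integer factorization algorithm:
    trial division, which is NOT a fast (polynomial-time) method.
    """
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return d
    return None


def prime_factorization(n: int) -> List[int]:
    """Split n recursively until every piece is prime."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    d = find_factor(n)
    if d is None:  # no nontrivial factor, so n itself is prime
        return [n]
    return sorted(prime_factorization(d) + prime_factorization(n // d))


print(prime_factorization(297))  # [3, 3, 3, 11]
```

Any routine that returns a nontrivial factor can replace `find_factor` without changing `prime_factorization`, which is the sense in which a fast integer factorization algorithm would give a fast prime factorization algorithm.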


Title: Security and Integrity in Outsourcing of Data Mining

Speaker: Prof. David Wai-lok Cheung, The University of Hong Kong

Time: December 17, 2007 (Monday), 8:40-9:10 AM


Abstract: Outsourcing of data mining to an outside service provider brings important benefits to the data owner. These include (i) relief from the high mining cost, (ii) minimization of demands on resources, and (iii) effective centralized mining for multiple distributed owners. However, security and integrity are issues that must be tackled before enterprises can actually outsource data mining tasks. The service provider should be prevented from accessing the actual data (security), and the results returned to the owner must be authentic (integrity).

In this talk, we will present the results of a secure association rule mining algorithm that we published recently, to explain the outsourcing model and to illustrate the feasibility of the approach we used. To protect security in mining association rules, a substitution cipher technique has been proposed for the encryption of transactional data. After identifying the non-trivial threats to a straightforward one-to-one item mapping substitution cipher, we propose a novel secure encryption algorithm based on a one-to-n item mapping that transforms transactions non-deterministically, yet guarantees correct decryption. We will also discuss the integrity problem in the same outsourcing model for association mining.


Title: Real-time Near-Duplicate Video Clip Retrieval

Speaker: Dr. HengTao Shen, University of Queensland

Time: December 17, 2007 (Monday), 9:10-9:40 AM


Abstract: Near-duplicate video clip (NDVC) retrieval is an important problem with a wide range of applications such as TV broadcast monitoring, video copyright enforcement, and content-based video clustering and annotation. For a large database with tens of thousands of video clips, each with thousands of frames, can NDVC search be performed in real time? In this talk, we discuss NDVC retrieval, including its background, concepts, applications, and challenges. Our key idea is to use a highly compact video summarization derived from all video frames. To be effective, the summarization must capture the original video content accurately, and to be efficient, the summarization should be small so that indexing structures can be deployed for fast search. We propose a novel technique that summarizes a video clip into a single, small representative which exploits the frame correlation and captures the dominating content and changing trends. By introducing a prototype system named UQLIPS, we demonstrate that such a summarization can be a practical solution for real-time NDVC retrieval from very large video databases.


Title: Whose Opinion Should You Trust Among a Group of Stock Market Analysts?

Speaker: Prof. Kam-Fai Wong, The Chinese University of Hong Kong

Time: December 17, 2007 (Monday), 10:10-10:40 AM


Abstract: Stock market reports help users make investment decisions. Analysts provide opinions about the trend of the market for the next trading day based on today’s information. However, users have to mine the analysts' opinions from a large amount of textual information in a short time. This information often contains contrasting opinions, which makes decisions difficult for users. In this work, we investigate learning-based and pattern-based approaches to mine analysts’ opinions automatically. We also investigate different opinion incorporation strategies for extracting the most reliable opinion. In the learning-based approach, n-gram features and carefully designed market features are employed. In the pattern-based approaches, certain fixed phrase structures are extracted to identify opinions. Experiments show that the pattern-based approaches achieve better results. The average precision score of analysts’ predictions is about 60%. The best incorporation strategy is to believe the best analyst.


Title: Out-of-Core Coherent Closed Quasi-Clique Mining from Large Dense Graph Databases

Speaker: Dr. Jianyong Wang, Tsinghua University

Time: December 17, 2007 (Monday), 10:40-11:10 AM


Abstract: Because graphs can represent more generic and more complicated relationships among different objects, graph mining has played a significant role in data mining and has attracted increasing attention in the data mining community. In addition, frequent coherent subgraphs can provide valuable knowledge about the underlying internal structure of a graph database, and mining frequently occurring coherent subgraphs from large dense graph databases has found several applications and recently received considerable attention in the graph mining community. In this talk, we introduce how to efficiently mine the complete set of coherent closed quasi-cliques from large dense graph databases, which is an especially challenging task because the downward-closure property no longer holds. By fully exploring some properties of quasi-cliques, we propose several novel optimization techniques that can effectively prune the unpromising and redundant sub-search spaces. Meanwhile, we devise an efficient closure checking scheme to facilitate the discovery of closed quasi-cliques only. Since large databases cannot be held in main memory, we also design an out-of-core solution with efficient index structures for mining coherent closed quasi-cliques from large dense graph databases. We call this Cocain*. A thorough performance study shows that Cocain* is very efficient and scalable for large dense graph databases.


Title: A semantic approach for content search and content extraction in XML query processing

Speaker: Prof. Tok Wang Ling, National University of Singapore

Time: December 17, 2007 (Monday), 11:10-11:40 AM



Title: Data Quality: A Database Perspective

Speaker: Prof. Xiaofang Zhou, University of Queensland

Time: December 17, 2007 (Monday), 2:00-2:30 PM


Abstract: There is a clear and strong trend of rising interest in the Data Quality (DQ) problem from the database research community, as evidenced by data quality related papers in prominent conferences and journals. The industry response to this issue has already resulted in dedicated tools and technologies that provide reliable data quality control methods. In this talk we will present an overview of the key issues and solutions. It will cover general aspects of the problem space as well as the most significant research achievements in key topics, focusing on data cleaning, linking, and provenance research by the database research community.


Title: Metadata Web: An Infrastructure for Enterprise Data Integration

Speaker: Dr. Yue Pan, IBM China Research Lab

Time: December 17, 2007 (Monday), 2:30-3:00 PM


Abstract: Efficient metadata management is recognized as a key to enterprise data integration. How can the multiplying heterogeneous metadata repositories in an enterprise be managed? How can the dynamic categorization of, and relationships between, customers, products, and services be exploited? Can the cost of business integration be vastly reduced? Can an enterprise get a dynamic integration platform without replacing its infrastructure? We propose an approach, Metadata Web, which can evolve by itself to address these issues. This showcase will introduce the concept of Metadata Web and demonstrate how existing metadata repositories can be transformed into “federated” ones with a mixture of open source and IBM technologies. It will also show how impact analysis and master data analytics lead to new insight from the semantically enriched metadata. We will also give an update on the state-of-the-art enabling technologies: Semantic Object Repository and Semantic Search. Finally, it concludes with a vision of the future of semantic integration.


Title: Effective and Efficient Semantic Web Data Management

Speaker: Dr. Li Ma, IBM China Research Lab

Time: December 17, 2007 (Monday), 3:00-3:30 PM


Abstract: With the fast growth of the Semantic Web, more and more RDF data and ontologies are created and widely used in Web applications and enterprise information systems. It is reported that the data set of the W3C’s LinkingOpenData project consists of over two billion RDF triples, which are interlinked by about three million RDF links. Recently, efficient RDF data management on top of relational databases has gained particular attention from both the Semantic Web community and the database community. In this presentation, I will discuss major challenges and problems in Semantic Web data management and address these issues by introducing a scalable Semantic Web data management system (SOR) over DB2, including efficient schema and index design for storage, practical ontology reasoning support, and an effective SPARQL-to-SQL translation method for RDF queries. In particular, I will present a scalable and highly expressive reasoning approach to Semantic Web data, which can process RDF queries over hundreds of millions of RDF triples efficiently, whereas state-of-the-art reasoners could not scale to this size. Finally, I will show the performance and scalability of well-known RDF stores, and emphasize remaining challenges and future work on Semantic Web data management.


Title: Towards a Chinese Web Infrastructure: Challenges and Our Roadmap

Speaker: Dr. Weining Qian, East China Normal University

Time: December 17, 2007 (Monday), 3:30-4:00 PM


Abstract: With the development of the World-Wide Web, the storage and utilization of web data has become a big challenge for the data management community. Though many commercial and academic tools have emerged, the structure, content, and user behavior of the Chinese Web have not been fully studied. We are working on building a Chinese Web Infrastructure to support such research. In this talk, the challenges of building such a system are analyzed, and our technical roadmap is discussed.


Title: XML Data Management Using Semantics in ORA-SS

Speaker: Ling Tok Wang, National University of Singapore (Professor, School of Computing, National University of Singapore)

Time: December 18, 2007 (Tuesday), 2:00-3:30 PM


Abstract: Traditional semantic data models, such as the Entity Relationship (ER) data model, are used to represent real world semantics that are crucial for the effective management of structured data.

Today, semistructured data has become more prevalent on the Web, and XML has become the de facto standard for semistructured data. A DTD and an XML Schema of an XML document only reflect the hierarchical structure of the semistructured data stored in the XML document. The hierarchical structures of XML documents are captured by the relationships between an element and its attributes, and between an element and its subelements. Element-attribute relationships do not have clear semantics, and the relationships between elements and their subelements are binary. The semantics of n-ary relationships with n > 2 cannot be represented or captured correctly and precisely in DTD and XML Schema. Many of the crucial semantics captured by the ER model for structured data are not captured by either DTD or XML Schema. We solve these problems by using a semantic-rich data model called the Object, Relationship, Attribute data model for SemiStructured Data (ORA-SS).

In this talk, I will discuss how to use the semantics captured in the ORA-SS for XML data management, such as XML database design, object-relational storage for XML data, XML view creation and validation, XML graphical query language and output, XML query optimization, XML keyword search, etc.


Title: A Semantic Approach for Content Search and Content Extraction in XML Query Processing

Speaker: Ling Tok Wang, National University of Singapore (Professor, School of Computing, National University of Singapore)

Time: December 19, 2007 (Wednesday), 2:00-3:30 PM


Abstract: Processing a twig pattern query over an XML document includes structural search and content search. Most existing algorithms focus only on structural search. They treat content nodes the same as element nodes during query processing with structural joins. Due to the high variety of contents, mixing content search and structural search suffers from content management problems and low performance. Another disadvantage is that, to find the actual values asked for by a query, they have to rely on the original document. In this talk, we propose a novel algorithm, Value Extraction with Relational Table (VERT), to overcome these limitations. The main technique of VERT is using relational tables to store document contents instead of treating them as nodes and labeling them. Tables in our algorithm are created based on semantic information of the documents. As more semantics is captured, we can further optimize the tables and enhance twig pattern query processing. We show by experiments that, besides solving different content problems, VERT also outperforms existing algorithms in twig pattern query processing.


Title: Effective Semantics for Ranked Keyword Search in XML Database

Speaker: Ling Tok Wang, National University of Singapore (Professor, School of Computing, National University of Singapore)

Time: December 20, 2007 (Thursday), 2:00-3:30 PM


Abstract: Keyword search is a user-friendly way to query XML databases. Most previous efforts in this area focus on keyword proximity search in XML based on either the tree data model or the graph (or digraph) data model. The tree data model for XML is generally simple and efficient for keyword proximity search. However, it cannot capture connections such as ID references in XML databases. In contrast, techniques based on the graph (or digraph) data model capture connections, but are generally inefficient to compute. In this paper, we propose an interconnected object trees model for keyword search that achieves the efficiency of the tree model while capturing connections such as ID references in XML by fully exploiting the property and schema information of XML databases. In particular, we propose ICA (Interested Common Ancestor) semantics to find all predefined interested objects that contain all query keywords. We also introduce novel IRA (Interested Related Ancestors) semantics to capture the conceptual connections between interested objects and include more objects that contain only some query keywords. Then, a novel ranking metric, RelevanceRank, is studied to dynamically assign higher ranks to objects that are more relevant to a given keyword query according to the conceptual connections in IRAs. Experimental results show that our approach outperforms most existing systems in terms of result quality. We have implemented a prototype of our ICRA system (ICRA = ICA + IRA) on the DBLP data.

Reposted from 仲子说 [ http://www.wangzhongyuan.com/ ]


