Adaptive Query Processing M.Tech Seminar Report
11-23-2010, 04:17 PM
(This post was last modified: 11-23-2010 04:17 PM by Academic_Project Reports.)
Adaptive Query Processing M.Tech Seminar Report
Abstract
Researchers are attempting to architect and implement a continuously adaptive query engine suitable for global-area systems and sensor networks. As query engines are scaled and federated, they must cope with highly unpredictable and changeable environments. Traditional cost-based query optimization has run into numerous limitations, such as poor cost models, data correlations, changing system resources, and changing data distributions. One promising technique for tackling these limitations is to abandon the optimize-then-execute model of query processing and instead interleave optimization and execution in an adaptive fashion. Query processing in adaptive environments has received a lot of research attention in recent years. Some proposals are evolutionary, such as changing query plans mid-flight, while others do away with query plans altogether and instead route tuples adaptively. This survey examines what the problems with query optimization are, which methods address which problems, and in which environments each method is appropriate. We assume basic familiarity with relational algebra, dynamic programming optimization, standard join algorithms, and some machine learning concepts such as entropy and gain ratio.

1. Introduction
Adaptivity has been a largely latent aspect of database research for the last few years. As computer systems scale up and federate, traditional techniques for system management should become more adaptive. Systems that can adapt gracefully and opaquely to changing data and environments are the need of the hour. Query optimization, with its attendant technologies for cost estimation, is one of the most active fields across different areas of computer systems. In the last few years, researchers have been exploring the design of adaptive systems that can operate in unpredictable and changeable environments.
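The idea of interleaving optimization with execution, mentioned in the abstract, can be illustrated with a deliberately small sketch. It is not taken from the report: the function name `adaptive_filter`, the `reorder_every` parameter, and the reordering rule (most selective predicate first, assuming all predicates cost the same) are assumptions made purely for illustration. The sketch applies a set of filter predicates to a stream of rows, measures how often each predicate passes while the query runs, and periodically reorders the predicates based on those observations.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Predicate = Callable[[int], bool]


def adaptive_filter(
    rows: Iterable[int],
    predicates: Sequence[Predicate],
    reorder_every: int = 100,
) -> Tuple[List[int], int]:
    """Filter rows, reordering predicates at runtime by observed pass rate.

    Hypothetical illustration only: equal predicate cost is assumed, so the
    predicate that rejects the most rows is moved to the front.
    Returns the surviving rows and the number of predicate evaluations.
    """
    # stats[i] = [times predicate i was evaluated, times it passed]
    stats = [[0, 0] for _ in predicates]
    order = list(range(len(predicates)))
    output: List[int] = []
    evaluations = 0

    for n, row in enumerate(rows, start=1):
        keep = True
        for i in order:
            stats[i][0] += 1
            evaluations += 1
            if predicates[i](row):
                stats[i][1] += 1
            else:
                keep = False
                break  # later predicates are skipped for this row
        if keep:
            output.append(row)

        # "Optimization" step interleaved with execution: re-rank the
        # predicates using the pass rates observed so far.
        if n % reorder_every == 0:
            order.sort(
                key=lambda i: stats[i][1] / stats[i][0] if stats[i][0] else 1.0
            )

    return output, evaluations


if __name__ == "__main__":
    preds = [lambda x: x % 2 == 0, lambda x: x % 100 == 0]
    adaptive_rows, adaptive_cost = adaptive_filter(range(1000), preds)
    # A huge reorder interval means the initial order is never changed.
    static_rows, static_cost = adaptive_filter(range(1000), preds, 10**9)
    print(adaptive_rows == static_rows, adaptive_cost, static_cost)
```

Both runs return the same rows; only the amount of work differs. Note that a predicate placed late in the order is measured only on rows that survived the earlier ones, so the observed pass rates are biased when predicates are correlated. That is one form of the data-correlation problem the abstract mentions.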
1.1 Problems with Traditional Cost-Based Query Optimization
As databases grow in size, variety, and range of target environments, the traditional "optimize-then-execute" model of query processing can result in sub-optimal execution of queries. The following are some of the responsible factors:

• Data complexity and bad estimates: Selectivity estimation for static data sets is fairly well understood, and there has been work on estimating statistical properties of static sets of data. But federated data often comes without any statistical summaries. Complex non-alphanumeric data types (e.g. XML) are now widely used, especially on the web. In such scenarios, and even in traditional static relational databases, selectivity estimates are often quite inaccurate. It is also difficult to maintain statistics on user-defined functions. All of this leads to suboptimal plans.

• User interface complexity: In large-scale systems, many queries can run for a very long time. As a result, there is interest in allowing users to control properties of queries during execution.

• Dynamic environments: With the advent of the internet and distributed systems, a DBMS is expected to handle heterogeneous types of data sources, e.g. locally stored tables, data streams, and sensor networks. This poses various challenges, such as differing data rates, unknown statistics about the data, variation in the data distribution that changes the selectivities of operators, and delayed data sources. It also introduces continuous queries in addition to the normal discrete queries.

• Complex queries: A DBMS is expected to cater to very complex queries, especially in the decision support domain. The performance of these complex queries depends largely on the order in which the direct joins are executed.

• Queries with parameter variables: A query may contain parameter variables whose exact values are known only at runtime, e.g. Select * from Test where Test.id='ID'. Here 'ID' may be a variable in the programming language in which this SQL statement is embedded. Depending on the value that the variable 'ID' assumes at runtime, the selectivity of the query changes, and a single selected plan may not suit all the possible selectivities.

• Need for sharing work between queries: This scenario applies especially to continuous queries, which run forever, so the optimizer has to share work among queries.

• Emergence of new metrics for query processing: In applications such as on-line aggregation, the user's preferences can change dynamically and the query is expected to respond to them as early as possible. This kind of "interactive" metric rules out static plans, as it requires dynamically changing the order in which data is processed.

• Change in the database state: Queries are optimized at compilation time. By the time the program is invoked and the plan is used, the database state may have changed. This change may render the plan infeasible if tables or indices referenced in the plan have been deleted, or suboptimal if bulk loads have changed the data distributions.
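The parameter-variable problem described above can be made concrete with a toy cost model. The sketch below is an assumption-laden illustration, not a method from the report: the function names `estimate_selectivity` and `choose_plan`, the cost constants, and the sample data are all hypothetical. It shows that the same query text can favor a different plan depending on the value bound to the parameter at runtime.

```python
from collections import Counter
from typing import Hashable, Sequence


def estimate_selectivity(column_values: Sequence[Hashable], value: Hashable) -> float:
    """Fraction of rows whose column equals `value` (exact, from the data)."""
    counts = Counter(column_values)
    return counts[value] / len(column_values)


def choose_plan(
    selectivity: float,
    n_rows: int,
    rows_per_page: int = 100,    # assumed page capacity
    seq_io_cost: float = 1.0,    # assumed cost of one sequential page read
    random_io_cost: float = 4.0, # assumed cost of one random row fetch
) -> str:
    """Pick the cheaper of two plans under a hypothetical cost model."""
    # Full scan: read every page sequentially, whatever the parameter is.
    full_scan_cost = (n_rows / rows_per_page) * seq_io_cost
    # Index scan: one random fetch per matching row.
    index_scan_cost = selectivity * n_rows * random_io_cost
    return "index_scan" if index_scan_cost < full_scan_cost else "full_scan"


if __name__ == "__main__":
    # Skewed column: one very common value and many rare ones.
    ids = ["common"] * 9000 + [f"rare{i}" for i in range(1000)]
    for param in ("common", "rare7"):
        sel = estimate_selectivity(ids, param)
        print(param, sel, choose_plan(sel, len(ids)))
```

Under these assumed costs, the common value favors a full scan and a rare value favors the index. A plan fixed at compile time has to commit to one of the two before the value is known, which is why a single plan may not suit every possible selectivity.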