Business intelligence & advanced database research


Research opportunities

Data management is a growing challenge for businesses. Data warehouses are approaching the scale of hundreds of terabytes. Queries are growing in quantity and complexity -- and they're longer, too, often extending over many pages.

At the same time, today’s demanding computing workloads compel the use of parallel processing across potentially hundreds of servers. Coordinating and optimizing work across that many processors is a massive problem in its own right. Managing the query workload alone is a major challenge: a data warehouse must answer simple sub-second lookups with the best possible response times while simultaneously meeting the deadlines for complex, hours-long batch queries.

Complicating matters further, new architectures continue to emerge, challenging existing approaches to such basic concepts as the memory hierarchy.

Research focus

Our research centers on large-scale database and information management, data warehouses, business intelligence, and parallel and data-intensive analytic computing.

In particular, researchers are working on technologies and ideas that transcend present platforms to enable very large, scalable enterprise databases and data warehouses using clustered parallel computing platforms. Our research interests include:

  • data warehouse workload management
  • multi-query optimization
  • multi-dimensional access
  • information visualization
  • data-intensive analytics
  • advanced database architecture

Current work

HP Labs is collaborating with HP’s product divisions and IT staff on an ambitious project: building one of the world’s largest enterprise data warehouses to solve real-world problems.

This data warehouse could well approach petabytes in size (one petabyte equals 1,000 terabytes or, by some estimates, as much data as exists in five Library of Congress book collections). The project will consolidate 85 data centers in 29 countries into six facilities. Ultimately, the HP data warehouse is being designed to answer any question, at any time, across all data subject areas and geographies.

We are working with global customers as well as HP’s IT staff to scale data warehouse technologies on highly parallel architectures. Our efforts include:

  • characterizing workload and automating workload management policies to meet service-level agreements
  • analyzing query streams to optimize reusability of query results
  • experimenting with novel data structures and access methods optimized for complex multi-dimensional queries and for increasing data availability
  • devising database algorithms that take better advantage of emerging processor, memory and storage capabilities
  • developing new data visualization techniques
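As an illustration of the query-reuse item above, here is a minimal sketch in Python of caching results so that a repeated query in a stream is answered without re-execution. It is not HP Labs' method: the names `normalize_query` and `ResultCache` and the `execute` callback are hypothetical. The text-based matching is deliberately naive, and the sketch ignores invalidation when the underlying tables change.

```python
import re
from collections import OrderedDict
from typing import Callable, Hashable, List, Tuple

Row = Tuple[Hashable, ...]


def normalize_query(sql: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and lowercase the text.

    Assumption: textual equality is a good enough notion of "same query".
    This also lowercases string literals, so a real system would compare
    parsed queries or plans rather than raw text.
    """
    collapsed = re.sub(r"\s+", " ", sql.strip())
    return collapsed.rstrip("; ").lower()


class ResultCache:
    """Bounded least-recently-used cache of query results (hypothetical)."""

    def __init__(self, execute: Callable[[str], List[Row]], capacity: int = 128):
        self._execute = execute      # callback that actually runs the query
        self._capacity = capacity    # maximum number of cached result sets
        self._results: "OrderedDict[str, List[Row]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def run(self, sql: str) -> List[Row]:
        key = normalize_query(sql)
        if key in self._results:
            # Reuse the stored result and mark it as most recently used.
            self.hits += 1
            self._results.move_to_end(key)
            return self._results[key]
        # Not seen before (or already evicted): execute and remember.
        self.misses += 1
        rows = self._execute(sql)
        self._results[key] = rows
        if len(self._results) > self._capacity:
            # Evict the least recently used entry.
            self._results.popitem(last=False)
        return rows
```

Wrapping a database call as `ResultCache(execute=my_runner)` and comparing `hits` to `misses` over a recorded query stream gives a rough estimate of how much reuse is available before investing in anything more sophisticated.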

Future applications

We are examining ways to add valuable non-conventional information to databases. Our work is beginning to allow users to add and manage information that normally wouldn’t be stored in a database: photos, satellite images, large-scale scientific data, and multimedia, including audio and video content.

We are also working on combining information-retrieval capabilities, typically applied to unstructured content such as text, with database query processing.

Data warehouse technology not only incorporates these rich forms of enterprise information, but also relates them to more typical, structured information. It might, for instance, make it possible to associate satellite images or photos with a database entry to answer a question or further an ongoing project.

We're also exploring incorporating Web content into the mix, using machine-learning techniques to provide better structure and organization for this information.

Many enterprises confront scientific questions that require high-performance computing power. Currently, many such data-intensive problems involve data sets too large to fit into available memory. Our work is designed to enable more powerful applications that could benefit pharmaceutical manufacturers, energy companies and other institutions that regularly encounter this class of problems.

© 2009 Hewlett-Packard Development Company, L.P.