noSQL, SQL, and mo’SQL – Big Data Complexities for Sci. Comp. in Oil and Gas
September 22, 2014 @ 6:45 pm
noSQL, SQL, and mo’SQL – Big Data Complexities for Scientific Computing in the Oil and Gas Industry
David M. Butler, PhD
President and Founder
Limit Point Systems, Inc.
Agenda
*** Bring ID (e.g. Driver’s License) for eBay Security ***
6:30 Doors Open, Food & Networking
*** Please arrive by 7 PM due to Security ***
The explosion of interest in the so-called “noSQL” category of tools for managing “Big Data” has finally reached such a crescendo that even the Wall Street Journal is reporting on it. And for good reason – there are commercially important applications where the complexity and effort of establishing a relational schema and using SQL just don’t pay off. The structure of the data and the access patterns in the application just aren’t complex enough, or understood well enough, to justify the effort.
But this clearly isn’t the whole picture. There are classes of applications for which the Structured Query Language isn’t structured enough. The data universe in these applications is more structured, more complex, than the tables, rows, and columns provided by the relational data model. The category of applications generically referred to as “scientific computing” is a major example. These applications produce Big Data at whatever size the infrastructure will bear. Most of this data represents what a physicist would call a “field”, that is, some property that depends on space and/or time. Not to be confused with what a database administrator might call a field, a physics field has a rich mathematical structure, including topological, geometric, and algebraic features, that cannot be usefully described by only the tables of the relational data model. Hence, what scientific computing needs is a /more/ Structured Query Language that can directly describe all this mathematical structure. To continue the naming theme established by noSQL, we’ll call this language “mo’SQL”.
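As a rough illustration of the gap described above, the sketch below contrasts a flat table of samples with a minimal field object that also carries topology (which vertices form a cell), geometry (vertex coordinates), and an algebraic rule (linear interpolation). The class name `Field1D`, the sample values, and the 1D mesh are all hypothetical; this is not the sheaf data model or the SheafSystem API, only a toy under those assumptions.

```python
from dataclasses import dataclass

# Flat, relational-style view: one row per sample, no notion of adjacency.
# (All values here are made up for illustration.)
rows = [
    {"vertex_id": 0, "x": 0.0, "pressure": 10.0},
    {"vertex_id": 1, "x": 1.0, "pressure": 20.0},
    {"vertex_id": 2, "x": 3.0, "pressure": 40.0},
]


@dataclass
class Field1D:
    """A scalar field on a 1D mesh (hypothetical toy, not SheafSystem)."""

    coords: list  # geometry: vertex id -> x coordinate
    cells: list   # topology: each cell is a pair of vertex ids
    values: list  # field value stored at each vertex

    def evaluate(self, x):
        """Linearly interpolate the field at position x."""
        for a, b in self.cells:
            xa, xb = self.coords[a], self.coords[b]
            if xa <= x <= xb:
                t = (x - xa) / (xb - xa)
                return (1 - t) * self.values[a] + t * self.values[b]
        raise ValueError(f"x={x} lies outside the mesh")


field = Field1D(
    coords=[r["x"] for r in rows],
    cells=[(0, 1), (1, 2)],
    values=[r["pressure"] for r in rows],
)

# The table alone cannot say what the pressure is between samples;
# the field can, because it knows which vertices bound each cell.
print(field.evaluate(2.0))
```

The table and the field hold the same numbers. What the field adds is the cell list and the interpolation rule, which is the kind of structure a plain row-and-column schema leaves to application code.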
We begin this talk by briefly reviewing the definition and meaning of “data model” and the strategic role a data model plays in the design and implementation of data management software. We introduce the notion of the “data model spectrum” describing the mathematical structures captured by various data models, show where scientific computing fits in this spectrum, and derive a set of requirements for mo’SQL. We continue by introducing the sheaf data model, a fundamental mathematical data model designed to meet these requirements. We describe various features of the data model and their implementation in the open source SheafSystem™ using examples drawn from the oil and gas industry – a major consumer of scientific computing. We end with a brief description of current work in collaboration with Prof. Magne Haveraaen at the University of Bergen’s Language Design Laboratory to define mo’SQL as a query language for the sheaf data model.
Dave Butler is President and Founder of Limit Point Systems, Inc. He has a PhD in Physics and more than 30 years of experience in developing “systems for processing scientific data” in a wide range of application areas. His work has emphasized research and development in data models for scientific computing, software architecture, and programming in C++ and a variety of other languages.