How do you store data?
Welcome to the latest news from TalkIT. This issue will look at storing and managing data. Software drives the world, but software needs data. How to get the most out of relational databases? How to migrate data across systems? What are NoSQL databases? Please add your comments at the bottom of this blog. These are just some thoughts, based on 12 years experience developing databases. As usual, I would really like to hear your ideas. Contents The Basics Relations are Important An Alternative A Connected World Can you solve this coding problem? Other Bits
The Basics
In the April newsletter I asked the question: how do you access data? There was a massive response to this question with lots of comments on Linked In Groups. In historic shift, the world’s knowledge is moving from paper, books and libraries to databases. How is this happening? Let us start with the basics. Storage started with simple flat files containing multiple records of data. Each record consists of a number of fields. Each field holds the data values in the form of numbers, text or dates. Software applications then access this data to solve useful problems. Think of a payroll application processing an employee file. Soon this flat file approach became limited. The real world provided more data and more complex data structures. This led to the evolution relational databases, where many related tables replaced a single flat file. These tables could now also store audio, image, video data.
Relations are Important
Relational Database Management Systems (RDBMS) have been in widespread use for about 25 years. They have a solid foundation in a branch of mathematics called set theory. The leading suppliers of database software include Oracle, Microsoft, SAP and IBM. Oracle is the biggest supplier.
A database contains one or more tables. Each table holds the raw data in rows & columns, structured rather like a spreadsheet. Connected data is now stored in multiple related tables. This allows applications to access the data in fast and effective ways. These tables can have different relationships including:
- One to one
- One to many
- Many to one
For example in a library database, the subjects table has a one to many relationships with the books table. This is because for each subject, there will be many books. The database schema maps these tables and their relationships. Standard Query Language (SQL) allows data to be created, updated, queried and secured. This widely used programming language lets developers work directly with data. SQL is powerful and fast evolving language. It allows developers to work in a very versatile way. It also has the advantage of not being specific to any one provider. But different providers use slightly different flavours of the language. It also has some limitations, for example it is not an Object Oriented Language. Take a look at the TalkIT courses and free tutorials in SQL Server. http://talk-it.biz/course-products/sql-server/ http://talk-it.biz/tutorial-categories/sql-server/
An Alternative
Recently a radically different db architecture has been gaining popularity. These are NoSQL databases. This has been partly driven by social media apps. These apps need constant fast access to small bits of data. Think of a the constant updates to a twitter feed. Here is a definition:
A NoSQL database environment is, simply put, a non-relational and largely distributed database system that enables rapid, ad-hoc organization and analysis of extremely high-volume, disparate data types. NoSQL databases are sometimes referred to as cloud databases, non-relational databases, Big Data databases and a myriad of other terms and were developed in response to the sheer volume of data being generated, stored and analyzed by modern users (user-generated data) and their applications (machine-generated data). In general, NoSQL databases have become the first alternative to relational databases, with scalability, availability, and fault tolerance being key deciding factors. They go well beyond the more widely understood legacy, relational databases (such as Oracle, SQL Server and DB2 databases) in satisfying the needs of today’s modern business applications. A very flexible and schema-less data model, horizontal scalability, distributed architectures, and the use of languages and interfaces that are “not only” SQL typically characterize this technology.
For more see: Planet Cassandra
A Connected World
In a connected world we need integrate many diverse data stores. A large company manages data from:
- Legacy applications
- Data Warehouses
- Call centres
- Mobile data
Data often has to be migrated, transformed, analysed and reported on. To do this Microsoft provides these Business Intelligence components as part of SQL Server
- Reporting Services (SSRS)
- Analysis Services (SSAS)
- Integration Services (SSIS)
SSIS allows data to be moved between diverse data stores, with the option of restructuring the data. Each SSIS process is stored and executed as a package. The package contains a sequence of data tasks. A package typically inputs data from a source, applies a transformation, then outputs to a destination. The same process can often be accomplished by writing ad-hoc SQL statements. But it is more effective to manage complex migrations using reusable SSIS packages. These can be tested, stored the shared across the team. So SSIS provides more control of data migration. Take a look at the TalkIT course in SSIS http://talk-it.biz/course/introduction-sql-server-integration-services-ssis-2012/
Can you solve this coding problem?
Here is a coding puzzle. The aim is to write a short and elegant program in a language of your choice. This time we are working with poker. Assume a standard 52 card pack. When a five card poker hand is dealt, what is the probability of getting:
- Any two of a kind, say two Kings?
- Any three of a kind say three Kings?
- Any four of a kind, say four Kings?
Interesting, but how do we code this? To calculate the odds we need to divide each pattern by the total number of ways of drawing 5 cards. The results could give you an advantage next time you play. For poker probabilities see: Hawaii University
Other Bits
TalkIT has created its own e-learning platform to train software developers. This is aimed at anyone who wants to learn a new technology or language from novice to advanced level. You will learn step-by-step how to build business applications and databases. On the TalkIT website there are currently over 70 tutorials. The topics include:
- SQL Server Development
- SQL Server Administration
- Visuals Studio
- Programming in C# and VB using OOP
- Net MVC & Web Forms
- Windows Forms
- Windows Presentation Foundation
- Windows Communication Foundation
- Data Access with ADO.Net, LINQ and the Entity Framework
- HTML 5
- CSS3
- JavaScript
Launch offer until end December 2014 – only £4.99 per course subscription or £9.99 for subscription to all courses. Many of the shorter tutorials are free. To find a tutorial just click: http://talk-it.biz/training/tutorials To find out more look at the FAQ page: http://talk-it.biz/training/tutorials/tutorials-faq/ More courses are comong soon. We will be publishing Java, C++ and Python courses in the next few months. If you have any questions about this please contact me.
Photos www.freedigitalphotos.net/
David Ringsell 2014 ©