Posts Tagged: db2

Search for the XML Superstar

IDUG (the International DB2 Users Group) is sponsoring a worldwide contest called The XML Challenge: Search for the XML Superstar. The contest aims to recognize developers (students or professionals) who create XML solutions in one of the following categories: Video, Gadget, Query, PortableApp and XML Contest.

They are offering thousands of dollars in prizes, including Wiis, Zunes, iPods, conference passes, notebooks, GPS units, etc.

If you live in the US, you can submit your Video and Gadget entries until December 16th and 17th, respectively. The XML programming contest has also started, and submissions will be accepted until January 31st.

For other countries, keep checking the website xmlchallenge.com for updates on your local contest.

SQLJ and JDBC

As a follow-up to my last post comparing Static SQL with Dynamic SQL, I will now post an example of how to run the same code using static and dynamic SQL.

One of my visitors left a comment saying that the scope of static and dynamic SQL in Oracle is different from the one I described. I am not at all familiar with Oracle, but I was able to find some information in their documentation, where they compare JDBC and SQLJ. Since their concept of static vs dynamic SQL is different from the one in DB2, my examples may not make sense for Oracle users. I also found out that although Oracle had planned to desupport SQLJ in its data server, that support was reinstated in their 10g release.

The two code samples I will show next are shipped with DB2 (get your free copy of DB2 Express-C) and can be found in the file %DB2FOLDER%/samples/java/sqlj/TbRead.java. I’ll use just one of the several examples in that file, one that executes a sub-select statement against the employee table.

Sample code in SQLJ:

#sql cur7 = {SELECT job, edlevel, SUM(comm)
	FROM employee
	WHERE job IN('DESIGNER', 'FIELDREP')
	GROUP BY ROLLUP(job, edlevel)};
while (true){
	#sql {FETCH :cur7 INTO :job, :edlevel, :commSum};
	if (cur7.endFetch()){
		break;
	}
	System.out.print("Job: " + job + " Ed Level: " + edlevel + " Tot Comm: " +commSum);
}

Sample code in JDBC:

Statement stmt = con.createStatement();
ResultSet rs = stmt.executeQuery("SELECT job, edlevel, SUM(comm) "
	+ "  FROM employee "
	+ "  WHERE job IN('DESIGNER','FIELDREP') "
	+ "  GROUP BY ROLLUP(job, edlevel)");
while (rs.next())
{
	// skip rows where job is NULL (e.g. the ROLLUP grand total)
	if (rs.getString(1) != null)
	{
		// JDBC columns are 1-based: read each column by its own index
		job = rs.getString(1);
		edlevel = rs.getString(2);
		commSum = rs.getString(3);
		System.out.print("Job: " + job + " Ed Level: " + edlevel + " Tot Comm: " + commSum);
	}
}

Although the two styles use different syntax, from a developer’s perspective the main difference is that with JDBC one needs to explicitly fetch the row values into Java variables one by one. A common comment from Java developers is that SQLJ is not really Java (one needs to use #sql clauses instead of Java method calls), so they prefer to stick with JDBC.

As I explained in my previous post, the biggest difference between these two styles (static SQL using SQLJ and dynamic SQL using JDBC) is that the SQL statements in the SQLJ files need to be compiled and bound to the database ahead of runtime. The following diagram illustrates this process:

[Diagram: staticSQL.jpg]

After the deployment process, SQLJ execution is simpler than JDBC. While JDBC statements need to be prepared at execution time, SQLJ statements are already compiled and ready to use. The two following diagrams illustrate these differences:

[Diagrams: jdbcstatement.jpg, sqljstatement.jpg]

As you can see, the static SQL execution process is much simpler, but it requires a more complex deployment process. This is an aspect of database development where DBAs and developers clash. DBAs prefer the finer-grained security and execution control provided by SQLJ and static SQL, while developers prefer the easier development process of dynamic SQL in the form of JDBC.

Soon, I will talk here about a new Java data access platform that supports both static and dynamic SQL, selectable at runtime through a JVM property. This allows DBAs and developers to use dynamic SQL in development and test environments and switch to static SQL in production. This way, the development community gets the best of both worlds: ease of deployment during the development and testing phases, and greater performance and control in the production environment.
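
To make the idea of a runtime switch more concrete, here is a minimal sketch of picking an execution style from a JVM system property. Everything in it is my own placeholder: the property name sql.exec.mode, the SqlRunner interface and both implementations are hypothetical and are not taken from any IBM product, which may work quite differently.

```java
import java.util.Locale;

// Hypothetical sketch: choose static or dynamic SQL execution from a JVM property.
class ExecModeSketch {

    /** Hypothetical abstraction over how a named statement gets executed. */
    interface SqlRunner {
        String describe(String statementName);
    }

    /** Stands in for running a pre-bound package section (static SQL). */
    static class StaticRunner implements SqlRunner {
        public String describe(String statementName) {
            return "static: run pre-bound section for " + statementName;
        }
    }

    /** Stands in for preparing and executing the statement at runtime (dynamic SQL). */
    static class DynamicRunner implements SqlRunner {
        public String describe(String statementName) {
            return "dynamic: prepare and execute " + statementName;
        }
    }

    // "sql.exec.mode" is an invented property name, used only for this example.
    // Anything other than "static" (including an unset property) falls back to dynamic.
    static SqlRunner fromSystemProperty() {
        String mode = System.getProperty("sql.exec.mode", "dynamic");
        return "static".equals(mode.toLowerCase(Locale.ROOT))
                ? new StaticRunner()
                : new DynamicRunner();
    }

    public static void main(String[] args) {
        System.out.println(fromSystemProperty().describe("employeeRollup"));
    }
}
```

Passing -Dsql.exec.mode=static on the java command line would select the static path in this sketch, while the application code calling describe() stays the same in both environments.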

If you are looking for a data management and application development tool, you should take a look at the new IBM Data Studio. It is an Eclipse-based development environment, free to download and use, with support for all major RDBMSs. Download IBM Data Studio.

Static SQL vs Dynamic SQL

I have been wanting to write a few technical articles here on the blog, so I’ll start with something related to what I have been looking into at work: Static and Dynamic SQL.

From the conversations I have had with both DBAs and developers, it is clear that DBAs prefer static SQL, while developers prefer Dynamic SQL.

The difference between static and dynamic SQL is that static SQL needs to be compiled and bound to the database before application runtime, while dynamic SQL is compiled during runtime. Next, I’ll show a list of pros and cons regarding each one. 

Static SQL

Pros:

  • compile at bind time. Since the statement is compiled only once, before we run our workload, all the database resources are available to generate the best possible query execution plan. In DB2, optimization levels range from 0 to 9, with 5 being the default. When we bind our application package, we can pick the highest optimization level (9) and get the best available execution plan. A higher optimization level requires more resources for the compile phase, but since our workload is not yet running, we can afford this.
  • security. Security is probably the most common reason why people use static SQL instead of dynamic SQL. Static SQL allows the DBA to set authorization at the package level. For example, consider an application package app1 that provides SQL functionality to select employees’ names and addresses from the table employees. The DBA can give user JOHN execute privileges on package app1, even if user JOHN does not have SELECT authority on table employees. Static SQL provides a much finer-grained layer of security.
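
The package-level authorization described in the security point can be pictured with a toy model. This is only a sketch in plain Java, not a DB2 API: the class and method names are invented for illustration, and the real privilege checks happen inside the database.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of package-level authorization for static SQL (not a DB2 API).
class PackageAuthSketch {

    private final Set<String> tableSelectGrants = new HashSet<>();    // entries like "USER:TABLE"
    private final Set<String> packageExecuteGrants = new HashSet<>(); // entries like "USER:PACKAGE"

    void grantSelect(String user, String table) {
        tableSelectGrants.add(user + ":" + table);
    }

    void grantExecute(String user, String pkg) {
        packageExecuteGrants.add(user + ":" + pkg);
    }

    /** Dynamic SQL: the user running the statement needs the table privilege. */
    boolean canRunDynamicSelect(String user, String table) {
        return tableSelectGrants.contains(user + ":" + table);
    }

    /** Static SQL: EXECUTE on the package is enough for the user running it. */
    boolean canRunPackage(String user, String pkg) {
        return packageExecuteGrants.contains(user + ":" + pkg);
    }

    public static void main(String[] args) {
        PackageAuthSketch auth = new PackageAuthSketch();
        auth.grantExecute("JOHN", "APP1");
        System.out.println("JOHN can run package APP1: "
                + auth.canRunPackage("JOHN", "APP1"));
        System.out.println("JOHN can SELECT from EMPLOYEES directly: "
                + auth.canRunDynamicSelect("JOHN", "EMPLOYEES"));
    }
}
```

In this model JOHN can only reach the employees data through the statements that the DBA chose to put in app1, which is the finer control the security point refers to.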

Cons:

  • need to bind before runtime. Although binding before runtime usually allows for more optimized access plans, doing this in a test or development environment can be cumbersome.
  • lack of tooling support. Most current IDEs provide coding assistance with support for APIs like JDBC. The lack of support from development tools discourages the use of static SQL.

Dynamic SQL

Pros:

  • IDEs and APIs. Using Eclipse to develop Java code that interacts with the database using JDBC or JPA is much simpler than developing a SQLJ application.
  • statement caching. Dynamic statement caching avoids the need to compile the same statement multiple times, increasing the performance to values close to static SQL. However, bear in mind that a cache miss will be extremely expensive.
  • better statistics. Because the statement is compiled at runtime, it uses the latest statistics available, contributing to a better execution plan.
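
The statement caching point can be illustrated with a toy cache keyed on the SQL text: a plan is compiled on a miss and reused on a hit. This is a simplified sketch of the general idea, not how DB2 implements its cache; the class name, the least-recently-used eviction policy and the fake compile step are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of a dynamic statement cache: compile on a miss, reuse on a hit.
class StatementCacheSketch {

    private final int capacity;
    private int compileCount = 0;
    // An access-ordered LinkedHashMap gives simple least-recently-used eviction.
    private final Map<String, String> plans;

    StatementCacheSketch(int capacity) {
        this.capacity = capacity;
        this.plans = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > StatementCacheSketch.this.capacity;
            }
        };
    }

    /** Stand-in for the optimizer; a real compile is far more expensive. */
    private String compile(String sql) {
        compileCount++;
        return "plan#" + compileCount;
    }

    /** Returns the cached plan for this SQL text, compiling only on a cache miss. */
    String planFor(String sql) {
        String plan = plans.get(sql);
        if (plan == null) {
            plan = compile(sql);
            plans.put(sql, plan);
        }
        return plan;
    }

    int compileCount() {
        return compileCount;
    }

    public static void main(String[] args) {
        StatementCacheSketch cache = new StatementCacheSketch(2);
        String query = "SELECT job FROM employee WHERE job = ?";
        cache.planFor(query);
        cache.planFor(query); // same text: served from the cache, no second compile
        System.out.println("compiles: " + cache.compileCount());
    }
}
```

Because a cache like this is keyed on the statement text, using parameter markers (?) instead of literal values makes it more likely that repeated executions hit the same entry.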

Cons:

  • compile at runtime. There are a few reasons why compile at runtime can be a bad thing:

    • every time a statement is executed, it needs to be compiled (unless it is found in the statement cache), increasing the total statement execution time
    • the compile time counts toward the total execution time, so using higher optimization levels may slow down the overall performance instead of improving it.
    • because the statement is only compiled at runtime, errors in the SQL statement won’t be detected until runtime.

As you can see, there are several reasons why you would choose one over the other. There is no perfect solution! But if you ask me, I would suggest the following: use Static SQL if security is your main concern and use Dynamic SQL if ease of development is your main concern.

Persisting XML with OpenJPA

I’ve been looking at JPA (the Java Persistence API) and decided to play a bit with OpenJPA using DB2 as the back-end. My goal: to persist and query XML data in DB2, making use of DB2’s pureXML capabilities to query the XML data using SQL/XML’s XMLQUERY() function.

However, while OpenJPA has extensive documentation, the examples are not always complete, and there isn’t a lot of information on the web about diagnosing and solving OpenJPA errors. So, here are some recommendations for some of the problems I have encountered. The class xml.Address is the one to be persisted as XML in the database using JPA, and it is stored as the field shipAddress of the Order objects.

 [java] Exception in thread "main" <openjpa-1.0.0-r420667:568756 fatal user error>
org.apache.openjpa.persistence.ArgumentException:
Type "class xml.Address" does not have persistence metadata. 

Suggestion: Remove the reference to xml.Address from persistence.xml

[java] Exception in thread "main" <openjpa-1.0.0-r420667:568756 nonfatal user error>
org.apache.openjpa.persistence.InvalidStateException:
Encountered unmanaged object "xml.Address@9b2a51" in
persistent field "xml.Order.shipAddress" of managed object "xml.Order@12b3349"
during flush.  However, this field does not allow cascade persist. 
You cannot flush unmanaged objects.
  [java] FailedObject: xml.Address@9b2a51

Suggestion: Make sure you have no @Entity or @Embeddable annotations in xml.Address. The main annotation is @XmlRootElement.

 [java] Exception in thread "main" <openjpa-1.0.0-r420667:568756 fatal general error>
org.apache.openjpa.persistence.PersistenceException:
"xml" doesnt contain ObjectFactory.class or jaxb.index

Suggestion: Add a file named jaxb.index to your xml package, listing all the classes to be persisted as XML. In our case, the file contains a single line: Address.

DB2 on Rails update

I’m back to fiddling around with my Ruby on Rails experiments(1)(2). I was able to create a very useful 2-way mapping between Ruby objects and XML data stored in DB2 pureXML. Basically, I am trying to replicate some of ActiveRecord’s functionality, but for XML data. I still find it odd, though, that neither ROXML nor xml-mapping has had much activity lately. I’m wondering if there is any new OXM library around that I don’t know of.

Also on the same topic:

  • the main DB2 on Rails website is up and running again, with a revamped design and now using WordPress instead of Typo.
  • a new version of the ibm_db driver was also released, containing several bugfixes. Update it through gems (gem update ibm_db) or from here: http://rubyforge.org/projects/rubyibm/

DB2 on Mac

Last week, after my presentation at University of Minho about the DB2 on Campus and DB2 Student Ambassador programs and the pureXML features in DB2, one student came to me and asked me if DB2 was available for Mac. My answer was a ‘no’, but things will change pretty soon. 

My ‘office neighbor’ Antonio Cangiano just made public IBM’s intention of releasing a DB2 Express-C port for Intel Macs. This is one more big step from DB2 Express-C towards the community, after very open licensing conditions, the Ruby on Rails driver and adapter, the soon-to-come Python and Django driver and adapter, the DB2 Express-C forum, the DB2 Express-C blog, etc.

 

XML and Databases

I just stumbled across an excellent resource regarding XML technology in databases. Ronald Bourret has the most extensive research I’ve seen on the state of the art in XML and databases. It includes an extensive, recently updated list of databases with XML support (native or by means of extenders/adapters) and several papers on XML and databases.

A must read, that I will be consuming over the coming weeks. 

Recent readings

How to get an access plan in DB2 using db2exfmt

The most common way to generate and retrieve access plans in DB2 is by using DB2 Control Center. CC provides a graphical representation of a query’s access plan, and it also includes an Index Advisor that you can use when you are not sure which indexes to create and use.

However, you do not always have access to a graphical environment (needed to run DB2 Control Center). For the command line, there are two utilities that you can use: the DB2 EXPLAIN command and the db2exfmt utility. Although the first one is more complete, I find the second one easier to use. To get an access plan for your query using db2exfmt, you just need to do the following:

  • db2 -tvf ~/sqllib/misc/EXPLAIN.DDL (create the explain tables where all the explain data will be stored)
  • db2 set current explain mode explain (this will put DB2 in explain mode
    and all the subsequent queries won’t be run, but explain data will be
    gathered)
  • run your workload (e.g., db2 -tvf query.sql)
  • db2exfmt -d dbname -1 -o output.txt (formats the information on the explain tables)

Detailed information about the access plan for your query will be in the file output.txt. From this information, you can see which indexes are being used, along with other performance considerations for your query.

PS: don’t forget to run "db2 set current explain mode no" when you are done with your access plans, so that your queries are executed again.

Weekend reading…

Top Story on developerWorks… it should be interesting 😉

Full article here.

For the curious and the database people, this is a long-awaited article exploring the performance differences between the new DB2 pureXML storage and the alternatives for managing XML data, like storing it as a CLOB or decomposing it.