[SQLiteJDBC] Why is this query answered instantly by the sqlite3 application, but slowly using sqlitejdbc?

Gary Helmstetter gh at theworld.com
Mon Mar 1 16:41:08 EST 2010


Brain,

Thank you for your interest and help!  I've been working on this  
continuously since receiving your note.  Detailed responses inserted  
below, in the hope that the day I've spent (all told) on this will not  
have to be repeated by others who run into it!

>
> On Sun, Feb 28, 2010 at 4:20 PM, Gary H <gh at theworld.com> wrote:
> I'm using Mac OS X v10.4.11 with the standard Java (1.5.0_19) and
> sqlite3 (3.1.3) that came it. (Yeah, a little old.)
>
> I have a sqlite3 database with a table with a few hundred thousand
> rows, with "name" and "stored" text columns. Name is one of six (so
> far) short strings; stored is a 19-character standard date-time
> string. Each column has a stand-alone index. (Essentially, the table
> tracks events in the life of 6 entities; its other columns describe
> the individual events.) The following query:
>
> > select distinct name from myTable where stored >= date("now");
>
> lists the relevant names instantly when I perform it through the Mac
> OS X "sqlite3" application. I get the same results, but it takes over
> 2 seconds to find each name (total of about 15 seconds), when I do the
> exact same query in the obvious way in my application:
>
> > String         q = "SELECT DISTINCT name FROM myTable " +
> >                     "WHERE stored >= DATE('now');" ;
> > try {
> >     Statement stmt = conn.createStatement();
> >     ResultSet rs = stmt.executeQuery(q);
> >     while (rs.next()) {
> >         final String s = rs.getString("symbol");
> >         System.err.println("Got " + s);
> >     }
> >     rs.close();
> > }
> I've tried this with both sqlitejdbc-v054 and sqlitejdbc-v055 . No
> perceptible difference.
>
> A wild speculation, based on the way it behaves, is that when the
> sqlite3 application does the query sqlite uses the indexes optimally
> (ie. applying "stored" first), but for some reason this doesn't happen
> when sqlitejdbc does the same query.
>
> Is this a known deficiency? If not, anyone have any suggestions how to
> attack it ?
>



On Mar 1, 2010, at 11:51 AM, Brain . wrote:

> With such a large time difference, it would be my guess that sqlite  
> is not using the index you want it to use.  It would be surprising  
> that sqlitejdbc would make a difference.

That's my feeling as well.  And with 30+ years of dbms experience,  
including internals, under my belt, it goes beyond surprising; it  
doesn't seem to make any sense at all.  Which means it's probably some  
kind of misbehavior at a low level - like a version skew, compiler  
bug, etc.

>  It's pretty much a small wrapper over the native functions.

Well, yes... but the way it's delivered and installed, it effectively  
contains a copy of sqlite... which may not (in this case, IS not)  
closely synchronized with that which created the database.

>
> Are you using the exact same database in both programs?
>  Of course, if you aren't then there could be differences in their  
> indexes.

Yes.  Positively the same database file.

>
> Sqlitejdbc uses sqlite 3.6.14.2.  So maybe that is why there is a  
> difference.

Indeed.  I had been assuming that newer versions of sqlitejdbc would  
work fine with older databases, but of course that's not necessarily  
the case!

Let me do some research... OK, I dug through the Crawshaw commit logs  
on github to determine the sqlitejdbc--sqlite version correspondences  
starting with sqlite 3.4.0.  It was a bit of work, so in case it has  
value to anyone ever again:

	commit date	sqlitejdbc	sqlite
	2007-06-18	034		3.4.0
	2007-12-30	038		3.5.4
	2008-02-13	040		3.5.6
	2008-04-16	044		3.5.8
	2008-05-29	047		3.5.9
	2008-08-12	053		3.6.1
	2008-12-04	054		3.6.3
	2009-04-06	054		3.6.10
	2009-06-09	055		3.6.14.2
	2009-06-09(2)	056		3.6.14.2


In the discussion below, for readability I call "Mac OS X 10.4" Tiger,  
and "Mac OS X 10.5" Leopard, and when I refer to "sqlite3" I'm talking  
about the Mac OS X command-line SQL interpreter application.  Tiger  
has sqlite3 v3.1.3, and Leopard has sqlite3 v3.4.0 .

I originally created my databases using Tiger's sqlite3.  I could  
create one with Leopard's and see if that works any better.  Taking  
this idea further, I could theoretically create my databases using my  
Java program, which would use sqlitejdbc v056 and thus sqlite  
3.6.14.2.   It seems highly likely the query would then work properly  
through sqlitejdbc.  However, in my particular situation there are  
three problems with that:

(1) It would be a bit of a structural hassle; my project creates its  
databases, and does various updates, by feeding a database and a text  
file of sql commands to the sqlite3 app.  To preserve that scheme I'd  
have to write a java program that provided the required sqlite3  
functionality.  That's not too hard....

(2) I'd then have a database that could probably only be accessed by  
my Java programs, since the Tiger or Leopard sqlite3 cannot be  
expected to be reliably forward-compatible with databases created by  
newer ones eg. 3.6.* . That is a show-stopper; I really need to be  
able to use Leopard's sqlite (and preferably Tiger's) to do ad-hoc  
queries and data maintenance.

(3) Even just using Leopard's sqlite3 to create the databases, I might  
have problems using them on my older iMacs; they have Tiger and I  
don't think they can be upgraded to Leopard.

>
> You have several options to debug this.
> 1.  You can try using the EXPLAIN command to try to figure out which  
> index it is using.  You can execute this as a query using  
> sqlitejdbc.  It might be a little tricky to get the output of it,  
> but it works.

I would have tried EXPLAIN if I thought it was possible to get its  
results through sqlitejdbc.  But assuming I did that and it confirmed  
that it used different (or no) indexes with sqlitejdbc, where would we  
go from there? (Answer below!)

> 2.  You can try using the INDEXED BY clause in your query to force  
> it to use an index that you want it to use or give an error why it  
> can't.
> http://www.sqlite.org/lang_indexedby.html

Great idea, that should either speed up the query (thus providing at  
least a temporary work-around), or give an error message that will  
shed more light on the problem.  Let me try it...

Eureka!  I added "INDEXED BY <name index>" after the table name in the  
query, and that modification caused it to run just as fast as when  
executed by sqlite3.


> 3.  http://www.sqlite.org/optoverview.html
> This page has several helpful sections such as the sqlite ANALYZE  
> command which will generate statistics about tables which helps it  
> choose which indexes it should use.

Sadly, after further experimentation:
  - Using "ANALYZE" to add the sqlite_stats1 table to a database  
created by Tiger sqlite3 doesn't help
  - Creating the database using Leopard sqlite3 doesn't help
  - Performing ANALYZE on a database created using Leopard sqlite3  
doesn't help

So it appears that the general problem is that new versions of sqlite  
(which are bundled into sqlitejdbc) do not optimize queries properly  
when operating against databases created by older versions of sqlite.

In my particular case I have a number of solutions available, all with  
uncomfortable issues, and the least uncomfortable is to add "INDEXED  
BY" clauses whenever I run into this problem.  The only other solution  
I can see is to download a "modern" sqlite3 source, build it locally,  
and put a private copy (so as not to screw up anything else they might  
use sqlite for) on the macs used by this project.  That's not much  
different than writing my own Java+sqlitejdbc implementation of the  
sqlite3 subset I need.

> I hope this helps.

Indeed it did.  Thanks again!

- Gary H



More information about the SQLiteJDBC mailing list