GnuCash
Contact   Instructions
Bug 796967 - gnclock table not removed when using PostgreSQL
Summary: gnclock table not removed when using PostgreSQL
Status: RESOLVED FIXED
Alias: None
Product: GnuCash
Classification: Unclassified
Component: Backend - SQL (show other bugs)
Version: 3.3
Hardware: PC Windows
: Normal normal
Target Milestone: ---
Assignee: core
QA Contact: core
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-12-04 12:44 EST by Moshe Katz
Modified: 2018-12-28 16:20 EST (History)
3 users (show)

See Also:


Attachments

Description Moshe Katz 2018-12-04 12:44:38 EST
When using PostgreSQL as the backend, the `gnclock` table is created to lock the database but then is not removed when the program exits.  Every time I open GnuCash, I first need to manually run `DROP TABLE gnclock`.

(Note: This is not a duplicate of 635926 which describes the same problem but is 8 years old.)

Steps to reproduce:

1. Create a new GnuCash database backed by PostgreSQL
2. Close GnuCash
3. Reopen GnuCash and try to reopen the last "file"

Expected Results:

The file opens.

Actual Results:

An error message:

> The server at URL postgres://USER@SERVER:PORT/DBNAME experienced an error or encountered bad or corrupt data.


In the tracelog when closing the program, I found the following line:

> * 12:18:16  WARN <gnc.backend.dbi> [GncDbiSqlConnection::unlock_database()] No lock table in database, so not unlocking it.

I looked at the code where this error message appears, and I see that it is calling `get_table_list` with the `lock_table` name as an argument. It seems that `get_table_list` is not returning the existence of the table.

This seems to be the case in creating the lock file as well: the code there also checks for the existence of the lock table using `get_table_list`, in order to decide whether to create it again. However, even though the table exists, it is not being found, as evidenced by the following log entry on reopening the program:

> * 12:16:04  CRIT <gnc.backend.dbi> [error_handler()] DBI error: ERROR:  relation "gnclock" already exists

The code in `GncDbiSqlConnection::lock_database` is only supposed to run `CREATE TABLE ...` if the result of `get_table_list` is empty - obviously the table already exists in this case but the result of `get_table_list` is empty despite that. What **should** be happening here if the lock already exists is an error ERR_BACKEND_LOCKED, not a database error.

Unfortunately, my C++ skills are not good enough to do much more troubleshooting myself.
Comment 1 Moshe Katz 2018-12-04 12:47:37 EST
Forgot to add...

 - This has been happening at least since 3.1, and possibly earlier.
 - This happens on PostgreSQL 9.6.x and 10.x.
Comment 2 John Ralls 2018-12-04 18:17:03 EST
Check your pgsql logs for a query that looks like
"SELECT relname FROM pg_class WHERE relname !~ '^pg_' AND relname LIKE 'gnclock' AND relkind = 'r' AND relowner = (SELECT datdba FROM pg_database WHERE datname = 'gnucash') ORDER BY relname"

Where "gnucash" is really your database name. Are there any errors in the pgsql logs about that query? If you run that query from the pgsql shell does it find gnclock?
Comment 3 Moshe Katz 2018-12-05 09:29:40 EST
I found that query, and the problem appears to be the `relowner` portion of the query.

When I run just `SELECT relname FROM pg_class WHERE relname !~ '^pg_' AND relname LIKE 'gnclock' AND relkind = 'r' ORDER BY relname`, I get the expected result.

I would guess that this is happening because the database itself is owned by a "group" role and each of the (two) users has their own login which is a member of the group, so the database is owned by the group but the `gnclock` table is owned by the user who created it.

I ran `ALTER TABLE gnclock OWNER TO my_group_name` and now it works as expected.

Is there a reason for this `relowner` check?
Comment 4 Moshe Katz 2018-12-05 09:35:59 EST
Now that I know what to look for, I see that this is how the query is written in libdbi (https://sourceforge.net/p/libdbi-drivers/libdbi-drivers/ci/master/tree/drivers/pgsql/dbd_pgsql.c).

I'm going to ask about it there.
Comment 5 Moshe Katz 2018-12-05 09:49:52 EST
See https://sourceforge.net/p/libdbi-drivers/bugs/24/
Comment 6 John Ralls 2018-12-21 19:12:42 EST
No answer as I expected. The libdbi project isn't really maintained anymore.

I wonder, though, how you managed to get the lock table with a different owner in the first place: The database is created and populated in a single session so all of the tables should be owned by a single user. Unlock removes the entry but leaves the table. Did you at some point drop the lock table behind GnuCash's back?
Comment 7 Moshe Katz 2018-12-24 08:53:42 EST
If I remember correctly, I once had to drop the table manually after Windows Update forced a restart without GnuCash closing properly. However, I don't think that is what caused this issue.

As I mentioned earlier, this database is owned by a Postgres ROLE which has two members. I think the issue is related to that fact, because the database owner is the group the but the table owner was the *member* of the group.
Comment 8 John Ralls 2018-12-24 12:02:59 EST
Right, but all of the tables should have been created together in the first session and so had the same owner. Dropping the gnclock and then connecting as the other member would have recreated it owned by the other member and produced the problem.

GnuCash expects stale locks to happen sometimes, that's why the lock dialog has an "Open Anyway" button. 

I'm closing this as "won't fix" because I don't think we can fix it except by replacing DBI. That's in the long-term plan but won't happen until GnuCash 4.0 at the earliest.
Comment 9 John Ralls 2018-12-28 12:38:47 EST
I encountered a problem with overwriting a database that I think has the same root cause, so I'm going to change the PGSQL version of get_table_list to use a SELECT query instead of dbi_conn_get_table_list.
Comment 10 John Ralls 2018-12-28 16:20:37 EST
Fixed for GnuCash 3.4.

Note You need to log in before you can comment on or make changes to this bug.