GnuCash
Contact   Instructions
Bug 798250 - Gnucash permanent hang on save while loading report
Summary: Gnucash permanent hang on save while loading report
Status: RESOLVED FIXED
Alias: None
Product: GnuCash
Classification: Unclassified
Component: Reports (show other bugs)
Version: 4.6
Hardware: PC Mac OS
: Normal critical
Target Milestone: ---
Assignee: reports
QA Contact: reports
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-07-17 08:58 EDT by Greg E
Modified: 2021-09-13 16:23 EDT (History)
4 users (show)

See Also:


Attachments

Description Greg E 2021-07-17 08:58:52 EDT
Summary: If user attempts to save changes to the database, followed by loading an existing report tab where the report is required to render before the save has completed, Gnucash permanently hangs with no output in trace file.

Steps to Reproduce:
1. Create a report in a tab (tested with Cash Flow, Income/Expense Bar, Portfolio Pie)
2. Move default focus on to any register
3. Save and close the file
4. Open the file  
5. Create a transaction in the register
6. Click 'Save'
7. Click to focus on the report tab

Expected Results: File saves and report renders.

Actual Results: Report fails to render, transaction fails to save, UI provides colorful beach ball spinner, trace file receives no warnings or errors.

Technical details:
MBP 2012 - 8 GB RAM
MacOS Catalina 10.15.7 (latest available for the hardware)
Gnucash Version: 4.6
Build ID: 4.6+(2021-06-26)
Finance::Quote: 1.47
Comment 1 John Ralls 2021-09-11 20:40:09 EDT
It appears to be a deadlock. The main thread is waiting for the save's xml-thread to continue; the xml-thread is stuck in a read of its input file descriptor:
* thread #32, name = 'xml_thread'
    frame #0: 0x00007fff203fccce libsystem_kernel.dylib`read + 10
libsystem_kernel.dylib`read:
->  0x7fff203fccce <+10>: jae    0x7fff203fccd8            ; <+20>
    0x7fff203fccd0 <+12>: movq   %rax, %rdi
    0x7fff203fccd3 <+15>: jmp    0x7fff203fd6bd            ; cerror
    0x7fff203fccd8 <+20>: retq
(lldb) up
frame #1: 0x0000000101b37e01 libgnc-backend-xml-utils.dylib`gz_thread_func(params=0x0000000125d0ed60) at io-gncxml-v2.cpp:1440:21 [opt]
   1437	    {
   1438	        while (success)
   1439	        {
-> 1440	            bytes = read (params->fd, buffer, BUFLEN);
   1441	            if (bytes > 0)
   1442	            {
   1443	                if (gzwrite (file, buffer, bytes) <= 0)
(

There are a bunch of WebKit threads sitting in g_main_loop waiting for an event to happen, that shouldn't be an issue.

And there's a guile garbage collection thread also stuck in a read:
(lldb) thr sel 21
* thread #21
    frame #0: 0x00007fff203fccce libsystem_kernel.dylib`read + 10
libsystem_kernel.dylib`read:
->  0x7fff203fccce <+10>: jae    0x7fff203fccd8            ; <+20>
    0x7fff203fccd0 <+12>: movq   %rax, %rdi
    0x7fff203fccd3 <+15>: jmp    0x7fff203fd6bd            ; cerror
    0x7fff203fccd8 <+20>: retq
(lldb) up
frame #1: 0x0000000100379a8c libguile-2.2.1.dylib`read_finalization_pipe_data + 28
libguile-2.2.1.dylib`read_finalization_pipe_data:
->  0x100379a8c <+28>: movq   %rax, 0x8(%rbx)
    0x100379a90 <+32>: callq  0x10042b41a               ; symbol stub for: __error
    0x100379a95 <+37>: movl   (%rax), %eax
    0x100379a97 <+39>: movl   %eax, 0x10(%rbx)

Mike Alexander, any ideas?
Comment 2 John Ralls 2021-09-12 19:28:44 EDT
More testing with some strategic printfs in gz_thread_func.
It's not actually blocking on the gc as I thought. Running the report takes over for a bit and ships off the report to the GtkWebKitWebView and the main thread resumes feeding bytes down the pipe which gz_thread_func dutifully sends off to gzwrite. With my test file the last chunk read off of the pipe is 64 bytes. If no report has run the next read gets 0 bytes terminates the thread, but if a report is run then gz_thread_func's read just sits waiting for more bytes.
Comment 3 John Ralls 2021-09-13 16:23:00 EDT
Found it, with some help from Simon Arlott. WebKit forks & execs as well as firing off threads and the fork was getting a copy of the pipe's file descriptors keeping the pipe from closing. Applying FD_CLOEXEC on them causes the fork's copies to immediately close so the pipe can when the save completes.

Note You need to log in before you can comment on or make changes to this bug.