Bug 797481 - crash on close of unsaved tabs by pressing [X]
Summary: crash on close of unsaved tabs by pressing [X]
Status: RESOLVED FIXED
Alias: None
Product: GnuCash
Classification: Unclassified
Component: General
Version: git-master
Hardware: PC Linux
Importance: Normal major
Target Milestone: ---
Assignee: general
QA Contact: general
URL:
Whiteboard:
Keywords:
Duplicates: 797518
Depends on:
Blocks:
 
Reported: 2019-11-04 13:15 EST by Ulla Selva
Modified: 2020-03-27 19:25 EDT
CC List: 5 users

See Also:


Attachments

Description Ulla Selva 2019-11-04 13:15:25 EST
Reproducible crash:

1. Modify any booking, but don't save.

2. Close tab of account by pressing [x] in the titlebar.

Results in crash of gnucash.

(Workaround: save before pressing [x], but it shouldn't crash anyway...)

My Version is:
master-C3.7-176-g9214f2ed5-D3.7-15-gc52384e (flatpak on debian)

Please get in touch if you need more information.

And BTW: THX a lot for the great work on gnucash!
Comment 1 John Ralls 2019-11-04 13:45:16 EST
I'm not able to reproduce this in a normal build on Ubuntu, so possibly a flatpak problem.

Please clarify what you mean by "Don't save". Do you mean don't commit the transaction you've modified (so that you get the unsaved transaction dialog when you close the tab) or do you mean don't save the file after committing the transaction?

Do you get the crash on a freshly-created file?
Comment 2 Ulla Selva 2019-11-04 15:36:05 EST
By don't save I mean: 
I don't commit the transaction I've modified. Then I click the close [X] in the titlebar of the account. After that I get the "unsaved transaction dialog" when I close the tab. When I click on "save transaction", gnucash crashes.
(It doesn't crash when I first commit the change, and then press the close [X].)

My user interface is in German, so the labels of the buttons might be slightly different.

I have now also tried working on a freshly created file. That crashes as well, but in a different way:
1. Click on Menu->new file
2. Click on cancel (not to create a complete hierarchy of accounts)
3. Click on the "new" button for new account
4. Type in a name for the account
5. Click on "ok" results in immediate crash.

Might be a flatpak problem but since I use the latest flatpak-version and the crashes are reproducible that easily I thought it might be helpful anyway.
Comment 3 John Ralls 2019-11-04 16:08:51 EST
That one I can reproduce.

Please create a file hierarchy the normal way and see if trying to save an uncommitted transaction crashes in that circumstance.

The flatpak you reported up top is two months old. Have you updated to today's (gnucash-master-C3.7-353-gf89691f73-D3.7-68-g2a160e0 or gnucash-maint-C3.7-190-g491088b2f-D3.7-52-ga14e85f) since then?
Comment 4 Ulla Selva 2019-11-04 16:26:40 EST
I tried creating a hierarchy by clicking on "new file", and then on "continue" (four times) until I got the "Anwenden" button (in German, which would probably be something like "commit" in English). Clicking on that "commit" button crashes gnucash as well.

Sorry, I was wrong about my version of gnucash (a copy-and-paste error from doing too many things at the same time). Actually I'm on this version (this time I double-checked it):

master-C3.7-353-gf89691f73-D3.7-68-g2a160e0

As far as I can see, this should be the newest flatpak version.
Comment 5 John Ralls 2019-11-04 17:34:44 EST
Aha! I get that one too, and it's the same one. It turns out that it's already fixed in maint and just needed to be merged into master, which I just did. You can either test again with today's maint flatpak or wait for tomorrow's master one.

As an aside, you do know that maint is the stable branch that we release from and that master is the unstable development branch for the next major release, 4.0?
Comment 6 Ulla Selva 2019-11-05 12:28:31 EST
Thanks for the (impressively quick) responses and the bug-fixing! And also for clarifying the difference between master and maint, which indeed wasn't clear to me.

I'm now on maint-C3.7-191-g085aa7693-D3.7-52-ga14e85f and indeed don't get the crash on creating a new file anymore.

Anyway, I still get the first reported crash. Because the thread is now a little bit long I repeat the steps for reproducing this crash:

1. I modify a transaction but don't commit it. 
2. I click the close [X] in the titlebar of the account-tab. 
3. After that I get the "unsaved transaction dialog". There I click on "save transaction" which results in an immediate crash of gnucash. 

( Gnucash doesn't crash when I first commit the change, and only then press the close [X] ).
Comment 7 John Ralls 2019-11-05 13:58:43 EST
And that's the one I can't reproduce. You'll recall that I asked you to create a new file and to test in that. Does that also crash?

Can you run gnucash under gdb using the instructions at http://docs.flatpak.org/en/latest/debugging.html then crash GnuCash and get a stack trace (https://wiki.gnucash.org/wiki/Stack_Trace)?
Comment 8 Ulla Selva 2019-11-05 14:46:31 EST
Yes, it also crashes when I use a new file.

Thanks for pointing me to gdb. I won't manage doing it today but I'll create the stack trace and get back with the result soon.
Comment 9 Ulla Selva 2019-11-06 16:42:57 EST
I just crashed it with a newly created file. Here is the stack-trace I've got:

Thread 1 "gnucash" received signal SIGSEGV, Segmentation fault.
0x00007f636645efeb in gnucash_sheet_get_block () from /app/lib/gnucash/libgncmod-register-gnome.so
(gdb) bt
#0  0x00007f636645efeb in gnucash_sheet_get_block () at /app/lib/gnucash/libgncmod-register-gnome.so
#1  0x00007f636645238e in gnc_item_edit_get_pixel_coords () at /app/lib/gnucash/libgncmod-register-gnome.so
#2  0x00007f6366452495 in  () at /app/lib/gnucash/libgncmod-register-gnome.so
#3  0x00007f636a8a8e58 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007f636a8a9248 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007f636a8a9572 in g_main_loop_run () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#6  0x00007f636351659d in gtk_main () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#7  0x00007f6366351b23 in gnc_ui_start_event_loop () at /app/lib/gnucash/libgncmod-gnome-utils.so
#8  0x0000564f82c99095 in  ()
#9  0x00007f636aa52c7d in  () at /app/lib/libguile-2.2.so.1
#10 0x00007f636aa3472a in  () at /app/lib/libguile-2.2.so.1
#11 0x00007f636aab610f in  () at /app/lib/libguile-2.2.so.1
#12 0x00007f636aabb9c9 in scm_call_n () at /app/lib/libguile-2.2.so.1
#13 0x00007f636aaaa0a7 in  () at /app/lib/libguile-2.2.so.1
#14 0x00007f636aa34d30 in  () at /app/lib/libguile-2.2.so.1
#15 0x00007f636aa34e15 in scm_c_with_continuation_barrier () at /app/lib/libguile-2.2.so.1
#16 0x00007f636aaa8b46 in  () at /app/lib/libguile-2.2.so.1
#17 0x00007f636a98cf25 in GC_call_with_stack_base () at /app/lib/libgc.so.1
#18 0x00007f636aaa8f28 in scm_with_guile () at /app/lib/libguile-2.2.so.1
#19 0x00007f636aa52e42 in scm_boot_guile () at /app/lib/libguile-2.2.so.1
#20 0x0000564f82c995c1 in main ()
Comment 10 John Ralls 2019-11-06 22:58:40 EST
Excellent. gnucash_sheet_get_block is a blessedly short function that takes two arguments and checks only one of them for NULL, so a band-aid fix would be to check the other one as well.

Better to spend a few minutes in the debugger though to see if I can figure out what the optimized-out call in frame 3 is and what it might have to do with the NULL argument.
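
A minimal sketch of what such a band-aid check could look like. The `Sheet` and `CellLoc` types and the `sheet_get_cell()` helper are hypothetical stand-ins for illustration, not GnuCash's real definitions or signature. Note that a check of this kind only catches NULL; it cannot detect a pointer to memory that has already been freed.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not GnuCash's real types. */
typedef struct { int rows, cols; int *cells; } Sheet;
typedef struct { int row, col; } CellLoc;

/* Returns a pointer to the addressed cell, or NULL if either argument
 * is NULL or the location lies outside the sheet. */
int *sheet_get_cell(Sheet *sheet, const CellLoc *loc)
{
    if (sheet == NULL || loc == NULL)   /* guard both arguments, not just one */
        return NULL;
    if (loc->row < 0 || loc->row >= sheet->rows ||
        loc->col < 0 || loc->col >= sheet->cols)
        return NULL;
    return &sheet->cells[loc->row * sheet->cols + loc->col];
}
```

Callers then have to treat a NULL return as "no block" instead of dereferencing it.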

BTW, are you using Wayland or X11?
Comment 11 Ulla Selva 2019-11-07 08:43:32 EST
Nice that it seems to be that simple to find and to fix. And yeah, I could contribute something (at least a very very tiny bit...)!

I'm using X11 (Debian 9 with KDE).
Comment 12 John Ralls 2019-11-08 16:14:49 EST
Well, my first impression of where the problem might be was wrong as usual. I'll spare you the details of where it really is but I've added some protective code and made sure that everything that calls it handles the return value correctly.

Please try tomorrow's nightly flatpak.
Comment 13 Ulla Selva 2019-11-10 13:29:35 EST
Sorry, unfortunately I still get the crash. The stack trace looks the same as the one above. Anyway, I'm including a current stack trace (you never know...).

I'm on flatpak-version:
maint-C3.7-208-g6f7c6b9de-D3.7-52-ga14e85f
(from 2019-11-10)

Thread 1 "gnucash" received signal SIGSEGV, Segmentation fault.
0x00007f6892535019 in gnucash_sheet_get_block () from /app/lib/gnucash/libgncmod-register-gnome.so
(gdb) bt
#0  0x00007f6892535019 in gnucash_sheet_get_block () at /app/lib/gnucash/libgncmod-register-gnome.so
#1  0x00007f6892528395 in gnc_item_edit_get_pixel_coords () at /app/lib/gnucash/libgncmod-register-gnome.so
#2  0x00007f689252849c in  () at /app/lib/gnucash/libgncmod-register-gnome.so
#3  0x00007f689697ee58 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007f689697f248 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007f689697f572 in g_main_loop_run () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#6  0x00007f688f5eb59d in gtk_main () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#7  0x00007f6892426b23 in gnc_ui_start_event_loop () at /app/lib/gnucash/libgncmod-gnome-utils.so
#8  0x000055d9122ad095 in  ()
#9  0x00007f6896b28c7d in  () at /app/lib/libguile-2.2.so.1
#10 0x00007f6896b0a72a in  () at /app/lib/libguile-2.2.so.1
#11 0x00007f6896b8c10f in  () at /app/lib/libguile-2.2.so.1
#12 0x00007f6896b919c9 in scm_call_n () at /app/lib/libguile-2.2.so.1
#13 0x00007f6896b800a7 in  () at /app/lib/libguile-2.2.so.1
#14 0x00007f6896b0ad30 in  () at /app/lib/libguile-2.2.so.1
#15 0x00007f6896b0ae15 in scm_c_with_continuation_barrier () at /app/lib/libguile-2.2.so.1
#16 0x00007f6896b7eb46 in  () at /app/lib/libguile-2.2.so.1
#17 0x00007f6896a62f25 in GC_call_with_stack_base () at /app/lib/libgc.so.1
#18 0x00007f6896b7ef28 in scm_with_guile () at /app/lib/libguile-2.2.so.1
#19 0x00007f6896b28e42 in scm_boot_guile () at /app/lib/libguile-2.2.so.1
#20 0x000055d9122ad5c1 in main ()
Comment 14 John Ralls 2019-11-10 17:27:55 EST
Poop.
Can you do it again but this time tell gdb `disass` instead of `bt`? That might help me to better figure out what it's trying to access.
Comment 15 Ulla Selva 2019-11-11 11:41:08 EST
Here you are:
(I'm now on maint-C3.7-210-g9189bcbe4-D3.7-52-ga14e85f)

Thread 1 "gnucash" received signal SIGSEGV, Segmentation fault.
0x00007f705b5f8019 in gnucash_sheet_get_block () from /app/lib/gnucash/libgncmod-register-gnome.so

(gdb) disass
Dump of assembler code for function gnucash_sheet_get_block:
   0x00007f705b5f7fb9 <+0>:     push   %rbp
   0x00007f705b5f7fba <+1>:     mov    %rsp,%rbp
   0x00007f705b5f7fbd <+4>:     sub    $0x30,%rsp
   0x00007f705b5f7fc1 <+8>:     mov    %rdi,-0x28(%rbp)
   0x00007f705b5f7fc5 <+12>:    mov    %rsi,-0x30(%rbp)
   0x00007f705b5f7fc9 <+16>:    cmpq   $0x0,-0x28(%rbp)
   0x00007f705b5f7fce <+21>:    jne    0x7f705b5f7ff4 <gnucash_sheet_get_block+59>
   0x00007f705b5f7fd0 <+23>:    lea    0x6156(%rip),%rdx        # 0x7f705b5fe12d
   0x00007f705b5f7fd7 <+30>:    lea    0x6a32(%rip),%rsi        # 0x7f705b5fea10
   0x00007f705b5f7fde <+37>:    lea    0x6156(%rip),%rdi        # 0x7f705b5fe13b
   0x00007f705b5f7fe5 <+44>:    callq  0x7f705b5e3b40 <g_return_if_fail_warning@plt>
   0x00007f705b5f7fea <+49>:    mov    $0x0,%eax
   0x00007f705b5f7fef <+54>:    jmpq   0x7f705b5f8093 <gnucash_sheet_get_block+218>
   0x00007f705b5f7ff4 <+59>:    mov    -0x28(%rbp),%rax
   0x00007f705b5f7ff8 <+63>:    mov    %rax,-0x10(%rbp)
   0x00007f705b5f7ffc <+67>:    callq  0x7f705b5e3da0 <gnucash_sheet_get_type@plt>
   0x00007f705b5f8001 <+72>:    mov    %rax,-0x8(%rbp)
   0x00007f705b5f8005 <+76>:    cmpq   $0x0,-0x10(%rbp)
   0x00007f705b5f800a <+81>:    jne    0x7f705b5f8015 <gnucash_sheet_get_block+92>
   0x00007f705b5f800c <+83>:    movl   $0x0,-0x14(%rbp)
   0x00007f705b5f8013 <+90>:    jmp    0x7f705b5f8050 <gnucash_sheet_get_block+151>
   0x00007f705b5f8015 <+92>:    mov    -0x10(%rbp),%rax
=> 0x00007f705b5f8019 <+96>:    mov    (%rax),%rax
   0x00007f705b5f801c <+99>:    test   %rax,%rax
   0x00007f705b5f801f <+102>:   je     0x7f705b5f803a <gnucash_sheet_get_block+129>
   0x00007f705b5f8021 <+104>:   mov    -0x10(%rbp),%rax
   0x00007f705b5f8025 <+108>:   mov    (%rax),%rax
   0x00007f705b5f8028 <+111>:   mov    (%rax),%rax
   0x00007f705b5f802b <+114>:   cmp    %rax,-0x8(%rbp)
   0x00007f705b5f802f <+118>:   jne    0x7f705b5f803a <gnucash_sheet_get_block+129>
   0x00007f705b5f8031 <+120>:   movl   $0x1,-0x14(%rbp)
   0x00007f705b5f8038 <+127>:   jmp    0x7f705b5f8050 <gnucash_sheet_get_block+151>
   0x00007f705b5f803a <+129>:   mov    -0x8(%rbp),%rdx
   0x00007f705b5f803e <+133>:   mov    -0x10(%rbp),%rax
   0x00007f705b5f8042 <+137>:   mov    %rdx,%rsi
   0x00007f705b5f8045 <+140>:   mov    %rax,%rdi
   0x00007f705b5f8048 <+143>:   callq  0x7f705b5e4670 <g_type_check_instance_is_a@plt>
   0x00007f705b5f804d <+148>:   mov    %eax,-0x14(%rbp)
   0x00007f705b5f8050 <+151>:   mov    -0x14(%rbp),%eax
   0x00007f705b5f8053 <+154>:   test   %eax,%eax
   0x00007f705b5f8055 <+156>:   jne    0x7f705b5f8078 <gnucash_sheet_get_block+191>
   0x00007f705b5f8057 <+158>:   lea    0x61ba(%rip),%rdx        # 0x7f705b5fe218
   0x00007f705b5f805e <+165>:   lea    0x69ab(%rip),%rsi        # 0x7f705b5fea10
   0x00007f705b5f8065 <+172>:   lea    0x60cf(%rip),%rdi        # 0x7f705b5fe13b
   0x00007f705b5f806c <+179>:   callq  0x7f705b5e3b40 <g_return_if_fail_warning@plt>
   0x00007f705b5f8071 <+184>:   mov    $0x0,%eax
   0x00007f705b5f8076 <+189>:   jmp    0x7f705b5f8093 <gnucash_sheet_get_block+218>
   0x00007f705b5f8078 <+191>:   mov    -0x2c(%rbp),%edx
   0x00007f705b5f807b <+194>:   mov    -0x30(%rbp),%ecx
   0x00007f705b5f807e <+197>:   mov    -0x28(%rbp),%rax
   0x00007f705b5f8082 <+201>:   mov    0x80(%rax),%rax
   0x00007f705b5f8089 <+208>:   mov    %ecx,%esi
   0x00007f705b5f808b <+210>:   mov    %rax,%rdi
   0x00007f705b5f808e <+213>:   callq  0x7f705b5e4040 <g_table_index@plt>
   0x00007f705b5f8093 <+218>:   leaveq 
   0x00007f705b5f8094 <+219>:   retq   
End of assembler dump.

(gdb) bt
#0  0x00007f705b5f8019 in gnucash_sheet_get_block () at /app/lib/gnucash/libgncmod-register-gnome.so
#1  0x00007f705b5eb395 in gnc_item_edit_get_pixel_coords () at /app/lib/gnucash/libgncmod-register-gnome.so
#2  0x00007f705b5eb49c in  () at /app/lib/gnucash/libgncmod-register-gnome.so
#3  0x00007f705fa41e58 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007f705fa42248 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007f705fa42572 in g_main_loop_run () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#6  0x00007f70586ae59d in gtk_main () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#7  0x00007f705b4e9b23 in gnc_ui_start_event_loop () at /app/lib/gnucash/libgncmod-gnome-utils.so
#8  0x000055d7b0d8e095 in  ()
#9  0x00007f705fbebc7d in  () at /app/lib/libguile-2.2.so.1
#10 0x00007f705fbcd72a in  () at /app/lib/libguile-2.2.so.1
#11 0x00007f705fc4f10f in  () at /app/lib/libguile-2.2.so.1
#12 0x00007f705fc549c9 in scm_call_n () at /app/lib/libguile-2.2.so.1
#13 0x00007f705fc430a7 in  () at /app/lib/libguile-2.2.so.1
#14 0x00007f705fbcdd30 in  () at /app/lib/libguile-2.2.so.1
#15 0x00007f705fbcde15 in scm_c_with_continuation_barrier () at /app/lib/libguile-2.2.so.1
#16 0x00007f705fc41b46 in  () at /app/lib/libguile-2.2.so.1
#17 0x00007f705fb25f25 in GC_call_with_stack_base () at /app/lib/libgc.so.1
#18 0x00007f705fc41f28 in scm_with_guile () at /app/lib/libguile-2.2.so.1
#19 0x00007f705fbebe42 in scm_boot_guile () at /app/lib/libguile-2.2.so.1
#20 0x000055d7b0d8e5c1 in main ()
(gdb)
Comment 16 John Ralls 2020-01-03 16:23:51 EST
That nailed it: The crash was in GNC_IS_SHEET because sheet had already been freed. 

The deep underlying cause is that the register classes were written for Gtk1 and have not been updated to properly use GObject's memory management facilities. Rewriting the whole register is a bit more than I want to do for a maintenance bugfix, so I've made a band-aid fix with g_object_weak_reference() instead.
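
The idea behind a weak reference can be sketched in plain C without GLib. Everything below (`Sheet`, `ItemEdit`, `sheet_weak_ref()` and the other helpers) is a hypothetical model for illustration, not GnuCash or GLib code; in GLib the comparable facilities are `g_object_weak_ref()` and `g_object_add_weak_pointer()`, and this thread does not show which call the actual fix uses. The holder registers a callback that clears its borrowed pointer when the object goes away, so that a later NULL check becomes meaningful.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model: an object that notifies one watcher on destruction. */
typedef void (*WeakNotify)(void *data);

typedef struct {
    WeakNotify notify;     /* called once, just before the sheet is freed */
    void *notify_data;
    int block_count;
} Sheet;

typedef struct {
    Sheet *sheet;          /* borrowed pointer: ItemEdit does not own the sheet */
} ItemEdit;

/* Weak-notify callback: forget the sheet so later NULL checks work. */
static void item_edit_sheet_gone(void *data)
{
    ((ItemEdit *)data)->sheet = NULL;
}

Sheet *sheet_new(int block_count)
{
    Sheet *s = calloc(1, sizeof *s);
    if (s != NULL)
        s->block_count = block_count;
    return s;
}

void sheet_weak_ref(Sheet *s, WeakNotify notify, void *data)
{
    s->notify = notify;
    s->notify_data = data;
}

void sheet_destroy(Sheet *s)
{
    if (s->notify != NULL)
        s->notify(s->notify_data);   /* tell the watcher before freeing */
    free(s);
}

void item_edit_init(ItemEdit *ie, Sheet *s)
{
    ie->sheet = s;
    sheet_weak_ref(s, item_edit_sheet_gone, ie);
}

/* Returns the block count, or -1 if the sheet no longer exists. */
int item_edit_block_count(const ItemEdit *ie)
{
    if (ie->sheet == NULL)
        return -1;
    return ie->sheet->block_count;
}
```

Without the notification, `ie->sheet` would keep its old non-NULL value after `sheet_destroy()`, and the check in `item_edit_block_count()` would pass straight through to a read of freed memory.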
Comment 17 John Ralls 2020-01-21 11:23:21 EST
Morrand mistook bug 702880 for this one. Apparently there's a path somewhere in Scheduled Transaction handling that registers an idle event with gnc_item_edit_get_pixel_coords and an uncounted reference to gnucash_sheet, see the stack trace in attachment 373543.
Comment 18 John Ralls 2020-03-27 18:04:52 EDT
*** Bug 797518 has been marked as a duplicate of this bug. ***
Comment 19 John Ralls 2020-03-27 19:25:09 EDT
Morrand's stack trace is
#0  0x0000000806b2903d in gnucash_sheet_get_block
    (sheet=0xaaaaaaaaaaaaaaaa, vcell_loc=...)
    at gnucash/register/register-gnome/gnucash-sheet.c:2189
#1  0x0000000806b1fccc in gnc_item_edit_get_pixel_coords
    (item_edit=0x82c3d6c20, x=0x7fffffffe3c4, y=0x7fffffffe3c0, w=0x7fffffffe3bc, h=0x7fffffffe3b8) at gnucash/register/register-gnome/gnucash-item-edit.c:232
#2  0x0000000806b20451 in gnc_item_edit_update (item_edit=0x82c3d6c20)
    at gnucash/register/register-gnome/gnucash-item-edit.c:260

That sheet=0xaaaaaaaaaaaaaaaa means that it isn't the sheet that's gotten freed, it's the GncItemEdit. Solution: handle the item_edit's "destroy" signal and release the idles in the handler.
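
The pattern of releasing idles on destroy can be sketched with a tiny stand-in for the main loop. All names below (`idle_add()`, `idle_remove()`, `idle_run_all()`, the simplified `ItemEdit`) are hypothetical and written for this illustration; in real GLib/GTK code the corresponding pieces would be `g_idle_add()`, `g_source_remove()` and the widget's "destroy" signal.

```c
#include <stddef.h>

#define MAX_IDLES 8

typedef int (*IdleFunc)(void *data);   /* return value is ignored in this model */

typedef struct { IdleFunc func; void *data; int active; } Idle;

static Idle idle_queue[MAX_IDLES];

/* Queue a one-shot callback; returns a 1-based id, or 0 if the queue is full. */
unsigned idle_add(IdleFunc func, void *data)
{
    for (unsigned i = 0; i < MAX_IDLES; i++) {
        if (!idle_queue[i].active) {
            idle_queue[i] = (Idle){ func, data, 1 };
            return i + 1;
        }
    }
    return 0;
}

/* Cancel a queued callback; id 0 is a no-op. */
void idle_remove(unsigned id)
{
    if (id >= 1 && id <= MAX_IDLES)
        idle_queue[id - 1].active = 0;
}

/* Run and clear every pending callback; returns how many ran. */
int idle_run_all(void)
{
    int ran = 0;
    for (unsigned i = 0; i < MAX_IDLES; i++) {
        if (idle_queue[i].active) {
            idle_queue[i].active = 0;
            idle_queue[i].func(idle_queue[i].data);
            ran++;
        }
    }
    return ran;
}

typedef struct { unsigned idle_id; int updates; } ItemEdit;

static int item_edit_update(void *data)
{
    ItemEdit *ie = data;
    ie->idle_id = 0;    /* the idle is one-shot, so it is gone after this call */
    ie->updates++;
    return 0;
}

void item_edit_queue_update(ItemEdit *ie)
{
    if (ie->idle_id == 0)
        ie->idle_id = idle_add(item_edit_update, ie);
}

/* "destroy" handler: cancel the pending idle so it never runs on freed memory. */
void item_edit_on_destroy(ItemEdit *ie)
{
    idle_remove(ie->idle_id);
    ie->idle_id = 0;
}
```

If `item_edit_on_destroy()` is skipped, the queue still holds a pointer to the destroyed `ItemEdit`, and the next pass through the loop calls `item_edit_update()` on it, which is the shape of the crash in the trace above.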

With Morrand's instructions I was also able to reproduce the problem so I'm sure that it's now fixed.
