Created attachment 373862 [details] Crash log generated when File->Open used to change to another Book This morning I entered a single transaction in Book A, which had been running overnight, clicked "Save", and after a pause clicked File->Open and selected Book B. Gnucash shut down, without opening Book B, and when a started GnuCash again, Book A was opened, with a message that no new Scheduled Transaction had been created. "Save" followed by a further attempt at opening Book B failed. On the third attempt, Book A was opened at first, with no messages, but the same result on an attempt to open Book B.
Further investigation: I tried using File->New after starting GnuCash and opening Book A, as before, but this caused a crash. I then renamed Book A, started GnuCash, and when Gnucash was unable to find the last-successfully-opened file I used File->Open to open Book B - and then Book C and Book D (I needed to override the LCK file warnings). Restoring Book A's name allowed me to open all Books without further difficulty.
Michael, that crash report is really strange. Can you attach the other two as well?
Created attachment 373864 [details] Crash log generated when File->Open used to change to another Book
Created attachment 373865 [details] crash log generated when File->Open used to change to another Book
I've sent the first two crash reports from yesterday. When I opened the Console App this morning, I found three more crash reports all date-stamped 16/09/2020, 07:51. Gnucash was left running overnight, but I hadn't used it at all today. I haven't attached these reports, but a quick look reveals that the datestamps within the logs themselves give yesterday's date, with times ranging from 10:49 to 10:52, i.e. a few minutes after those you already have.
The renaming-of-Book-A procedure I described above doesn't survive a Mac OS restart. I restarted my Mac yesterday evening, and after opening Book A (the most recently used) GC I caused a crash when I tried to File->Open Book B. I've confirmed that the renaming procedure has worked again this morning.
All three are the same crash, so I suppose that the rest are too. This appears to be an idle event with a stale page reference firing in the report render code. Does "Book A" include one or more open reports? If you close them all does it still crash when you switch books?
Yes - it did. No, once the reports were closed, Book B (or C or D) can be opened without a problem.
Thanks. I'm pretty sure that means that for some reason a report is trying to render and there's a race between the idle event firing and the page getting freed. Was the open tab in book A a register or report one?
It was a register. I've tried to remember what was being reported, but it eludes me now!
(In reply to Michael Hendry from comment #10) > It was a register. Interesting. I was expecting that it would be a report because I think that reports are rendered only if the tab is focussed. > I've tried to remember what was being reported, but it eludes me now! Uh-oh. Does that mean you can't reproduce the crash now?
I may need to set up a register, shut down GC then reboot and try the sequence again - not practicable at this moment, but I'll try tomorrow.
For developers, won't make sense to users: The only way I can see this crashing is if the plugin_page gets unreffed without calling gnc_plugin_page_destroy_widget first. The cleanest fix would be to rewrite all of gnc_plugin_page to implement dispose and finalize, taking care of the cleanup in whichever is appropriate (dispose in this case). Having to call destroy_widget before calling unref rather misses the point of ref and unref.
*** Bug 797948 has been marked as a duplicate of this bug. ***
@BobIT, can you see a way for the idle event to leak when the report tab is destroyed? Unfortunately like Michael the reporter on bug 797948 can't get it to crash having removed the report that was causing it, even after re-creating the report.
@John, I have had a look at this and can not see how this bug is being caused. I tried on my Linux VM and Windows10 PC switching between files with reports in open tabs to try and make it fail but was unsuccessful. Also added some print statements to see what order functions are being called and nothing stood out there. I am a little confused that the crash reports seem to indicate that it failed trying to load a report from the gnc_plugin_page_report_load_uri function but could not see how that could cause the issue. @Michael, As the report configurations for open tabs are stored in the 'file name'.gcm file, I was wondering if you may of had a backup for these 'Book A' and 'Book B' .gcm files from before the 15/09/2020. If you do, maybe you can try and replace the current one, making sure you have a copy and see if that makes it start to fail again. The file locations are explained here https://wiki.gnucash.org/wiki/Configuration_Locations which I think points to HOME/.local/share/gnucash/books/...
OK, I've found the most recent Time Machine backups of A.gnucash.gcm (dated 15th September 2020) and B.gnucash.gcm (dated the 9th). Do I need to work with A.gnucash and B.gnucash files with the same dates? or will the current ones suffice? Michael
Michael, I think the quickest way to see if the error would occur is... 1) start Gnucash and make sure file 'A' is the one that starts. 2) close Gnucash 3) copy the existing two .gcm files to a safe place 4) replace the existing two .gcm files from your backup 5) start Gnucash which should open 'A' and confirm it has report tabs open 6) Now do File->Open to open 'B' as you reported If it fails, can you confirm which reports were left open in tabs for both 'A' and 'B'. To get back to normal, with Gnucash closed replace both .gcm files with the ones saved in step 3). If this causes the same failure, maybe John can provide some Mac specific commands to provide more detail.
I think it would be wise to note all of the open tabs, including order and which one has focus in file A. It won't hurt to have the same for file B but I don't think it matters as it's crashing while closing down A. Even a File>New should cause the crash. Not Mac-specific but running with --log gnc.gui=debug might provide some clues. gnc-plugin-page-report.c isn't well instrumented but there are a few info log messages. Bob, is it possible that the GncPluginPageReport is getting freed by g_object_unref in a way that bypasses gnc_plugin_page_report_destroy_widget?
OK. 1. I'm not using my live files at all for this test, and am using a separate CrashTest directory for the .gnucash files. 2. My file A.gnucash is a copy of the current version of the one whose closure initiated the crash. 3. My file B.gnucash is a copy of the current version of the file which I was File->Opening in place of A. 4. I have made copies of the 15th September A.gcm and B.gcm files in the appropriate Application Support directory. When I open A.gnucash there is one Register tab, which is a Search Results report with a single transaction from 2015. I'd been looking for any contributions made to "Save The Children" following a recent hacker warning from the US branch of this charity, and had used Cmd-F from the Accounts tab so as to include all possible sources of the cash for this donation. Unfortunately, File->Open B.gnucash doesn't reproduce the crash. I'll try a few variations - for example, I realise that neither A.gnucash nor B.gnucash was visible in the recently-opened list, and I need to Quit gnucash to get that quick access to B.gnucash. I'll also hunt down the most recent A.gnucash and B.gnucash files prior to the crash.
Michael, that's at odds with comment 8 in which you said that book A had report tabs and that closing them prevented the crash.
Indeed. I've just been reading through bug report 797948, and spotted a reference to HomeBrew. I've never used HomBrew to install GnuCash, but at about this time I did some updating of HomeBrew, which I hadn't touched for some months. Is it possible that GnuCash was depending on a resource provided by HomeBrew, and that resource had been changed during a run of GnuCash?
No. As I pointed out in comment 1 to that bug, the Homebrew GnuCash formula, https://formulae.brew.sh/api/cask/gnucash.json, just downloads the dmg from https://github.com/gnucash/gnucash. Everything Gnucash.app is built from source using jhbuild on my Mac and I don't use Homebrew.
I've managed to recreate the crash by restoring the MDH.gnucash file from backup (dated 9th September 2020) and MDH.gcm from 10th September, renaming them A.gnucash and A.gcm respectively, and moving A.gnucash to a test directory. This time it occurred when I started up Gnucash normally (opening the most recently opened file) and then used File->Open to open ./CrashTest/A.gnucash. To my inexperienced eye, the crash report looks the same.
Created attachment 373879 [details] Crash log generated when File->Open used to change to another Book
Michael, Yes, it's the same stack trace. Is your A & B nomenclature the same as before? Have you restarted GnuCash since this crash? If not, please attach the tracefile. If yes, did GnuCash open A or whatever was previously open? If GnuCash opened the previous file then a couple of other experiments come to mind: Start GnuCash from Terminal with --nofile and try to open A. Does that crash? What about starting by specifying A on the command line?
Yes, the nomenclature is the same - MDH.gnucash is my personal file, and this is what I called Book A in the original report, and it was RotaryCharity.gnucash (Book B) I was opening instead when the original crash occurred. I've made copies of these files (and their .gcm files) and called them A.gnucash, A.gnucash.gcm, B.gnucash and B.gnucash.gcm, and it's these files that I'm using for these tests. I'd already opened other files using Gnucash after the crash, and I now find I can't recreate the crash (using the files dated 9th and 10th September). Where would I find the tracefile, in the event of a further crash?
Sorry, I thought you knew about tracefiles already: https://wiki.gnucash.org/wiki/Tracefile. Could you attach A.gcm--the one recovered from TimeMachine--after munging anything sensitive?
Created attachment 373880 [details] .gcm file associated with the crash As requested
I've located the tracefile - empty.
It gets overwritten every time you run GnuCash (except on Windows) so it's only useful right after the crash.
I've tried the command-line approach: /Applications/Gnucash.App/Contents/MacOS/Gnucash --nofile but haven't managed to reproduce the crash, which isn't coming up using the GUI route either. Before each attempt, I copy the Time-Machine-recovered A.gnucash and A.gnucash.gcm into the test directory. Grasping at straws here, but I wonder if this could be related to pressure on system memory. I use a couple of Parallels virtual machines (one for legacy Windows programs and one for Ubuntu 20.04) which are memory-hungry. Alos, when I use Lightroom, I find it's a real memory-hog. I know nothing of the inner workings of memory management on the Mac, but wonder if the opening of a new file by GnuCash could cause memory pressures if some greedy apps are already running?
PS This comes up when I use the --nofile option: (? significance) | => /Applications/Gnucash.App/Contents/MacOS/Gnucash --nofile Application Path /Applications/Gnucash.App/Contents/MacOS/Gnucash (process:6849): gnc.gui-WARNING **: 11:17:29.052: [mac_set_languages()] Language list: en:en_GB:C (Gnucash:6849): Gtk-WARNING **: 11:17:57.748: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node header owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:17:57.748: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tabs owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:17:57.748: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:17:57.811: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node header owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:17:57.811: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tabs owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:17:57.811: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:17:58.301: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:17:58.449: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:17:58.576: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:17:58.707: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:17:58.819: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:29:05.275: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:29:06.104: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:29:06.591: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:29:06.719: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:29:06.848: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook) (Gnucash:6849): Gtk-WARNING **: 11:29:06.958: Drawing a gadget with negative dimensions. Did you forget to allocate a size? (node tab owner GtkNotebook)
Reproduced the crash again this afternoon. GC wasn't running, I started it from the dock and A.gnucash was opened (as the most recently opened). Various Scheduled Transactions were due, so I authorised those, clicked save, and when the activity indicator stopped I used File->Open to select the Rotary Charity Account (which was Book B in the original scenario). The crash log appears the same as all the others, and the trace log is empty but recently "touched": -rw------- 1 michaelhendry staff 0 5 Oct 16:33 gnucash.trace
Physical memory pressure might be involved if it slows down execution enough to make worse a race condition, bearing in mind that the event-loop race condition is an unproven hypothesis at this point. The "allocate" in those Gtk error messages isn't about allocating memory, it's about some component of GtkNotebook that doesn't specify its dimensions properly when it's created. The latest crash matches pretty well the original scenario, which I suppose is reassuring. Yesterday's (comment 25) seemed backwards in that you started GnuCash with a different file (was it the Rotary account or a third one and did it have any open reports?) and crashed when loading A.gnucash. You could run from the commandline with --debug to get a fuller tracefile but that might get tedious given that it doesn't crash every time.
Michael and I have been working on this directly. I've built a couple of GnuCashes with more debug statements and he's crashed it with those statements turned on. Here's what we've found: The report whose load is crashing is (in the latest run anyway) Reports>Experimental>Balance Sheet Multicolumn The idle event is getting set, then gnc_plugin_page_report_destroy_widget is called and g_idle_remove_by_data returns true. gnc_plugin_page_report_dispose (I added a minimal implementation as a check) and gnc_plugin_page_report_finalize are called shortly after the widget is destroyed. The main window is destroyed. gnc_plugin_page_report_load_uri is called and crashes in GNC_PLUGIN_PAGE_REPORT because the memory was freed. There's only one way that gnc_plugin_page_report_load_uri can be called and that's the idle event that was supposedly removed, so it's clear that g_idle_remove_by_data isn't working. My first thought was to pass a weak pointer to gnc_plugin_page_report_load_uri, but as usual for GObject weak pointers (and they have 3 different kinds!) aren't thought through well enough to actually work. Bob Fewell, any ideas?
Created attachment 373898 [details] Change report loading The first thing was why we used the idle and this goes back to 2016 when the charts would not use the space correctly. Now we are using Gtk3 and chartjs I was thinking that it may not be required. By just calling gnc_plugin_page_reort_load_uri directly from the realize function this seemed to work but unfortunately the report pages then get rendered when the splash screen is shown. So as a test, with the enclosed patch the report rendering is now definitely when the page is focused with the status bar showing progress. This is achieved by checking whether the main window is shown, if it is when the page focus widget function is called the report is created. I was trying to do this all in the report plugin page, but saving the file and then loading another file the main window is already available and hence as each page is created so are the reports so maybe this check will need to be added to GncMainWindow with a function to get state. I will look at that later...
I made a test build with your patch applied, bundled it, and put it up for Michael to download, which he has done. He was fortuitously in a consistently-crashing state and the test GnuCash started up and didn't crash when he switched files. So a preliminary success.
I have added my revised fix to PR808
Oh, I was going to just wait a few (more) days to make sure that Michael didn't get any more crashes and if he didn't push it to maint.
The changes made I think simplify the fix as it no long relies on a call back to see when the main window is shown but in essence still does the same thing to load the reports directly from the focus_widget function and do away with the g_idle_add.
Despite all my efforts, I've failed to produce any more crashes.
That's good, right? Bob, go ahead and close this when you merge PR 808.
Pushed PR808 to maint so closing, fix will be in version 4.3
*** Bug 798007 has been marked as a duplicate of this bug. ***