GnuCash
Contact   Instructions
Bug 748431 - Wrong average balance for transactions during DST
Summary: Wrong average balance for transactions during DST
Status: RESOLVED FIXED
Alias: None
Product: GnuCash
Classification: Unclassified
Component: Reports (show other bugs)
Version: unspecified
Hardware: Other All
: Normal normal
Target Milestone: ---
Assignee: reports
QA Contact: reports
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-04-24 16:46 EDT by Mosè Giordano
Modified: 2019-03-31 21:55 EDT (History)
4 users (show)

See Also:


Attachments
example reproducing the bug (3.60 KB, application/x-gzip)
2015-04-24 16:46 EDT, Mosè Giordano
no flags Details

Description Mosè Giordano 2015-04-24 16:46:50 EDT
Created attachment 302318 [details]
example reproducing the bug

Transactions dated when daylight saving time is in force result in wrong yearly average balance.

The attached account helps reproducing the error, just open Reports > Assets & Liabilities > Average Balance, and select Year as Step size.  The average balance for 2015 is 254.04$, instead of the expected integer 254$.

In some countries the yearly avarage balance is requested for tax return, so having GnuCash reporting the correct average balance would be really useful.

This is somewhat related to Bug 137017.
Comment 1 Christopher Lam 2019-03-14 23:02:01 EDT
I can confirm that the average-balance is time sensitive. The explanation comes from some mathematical definitions, and we need to delve into basic geometry and algebra to elucidate this.

Problem
-------
How do we convert a list of transactions into an average balance?

If, during a period (eg month) the account has a starting balance of $100, and receives $100 mid-month, the new balance is $200, and the ending balance is also $200. It is obvious to everyone that the average balance is $150.

If the starting-balance is $100, and an incoming $100 on the last day of the month, the average balance should intuitively be a hair above $100.

A balance chart in 2D: the 

     ^
     |
 200 +   -   -   -   -   -   -   - +------+
     |                             |//////|
 150 +   -   -   -   -   -   -   - |//////|
     |                             |//////|
 100 +-----------------------------+//////|
     |////////////////////////////////////|
  50 |////////////////////////////////////|
     |////////////////////////////////////|
   0 +-----------------------------+------+-->
     |
   start                         split   end

In this chart, 'start' is the period-start date, 'end' is the period-end date, and 'split' represents date for the late influx of $100.

The average is easily calculated by the formula

(split - start) * 100 + (end - split) * 200
-------------------------------------------
                end - split

And this is exactly what the original (analyze-splits) function aims to do.

Unfortunately, in gnucash, 'start' and 'end' are set as 'start-of-day' thanks to gnc:time64-start-day-time (i.e. 0:00 and set-tm:isdst -1 means 'unknown' :-O).

Split-posted dates are set in both the test suite and in regular books via (xaccTransSetDate txn DD MM YY) with split-posted-time sometime during the day, which seems TZ-dependent.

Hence the formula above will always show different values in different time zones.

To fix this I'm not sure how to proceed - I can artificially convert split-time to midday which seems the most sensible, or I can figure out my own averaging formula elsewhere. But any commit to fix this will *definitely* cause the average-balance figures to change slightly for most people. But after fixing the average-balance will at least be stable across all TZ. I hope.
Comment 2 John Ralls 2019-03-15 00:21:43 EDT
The time of posted-date is 10:59:00 UTC, chosen to produce the same date in most timezones, the exceptions being -12 (Baker Island) and +14 (Kirimati).

Report periods should begin at 00:00:00 on the beginning day and end at 23:59:59 on the ending day so that the period has the mathematical interval (begin, end). A report that ends on 00:00:00 of the ending day will have the interval (begin, end], leaving out the last day of the interval regardless of what time that day's transactions are posted.

Please start by making sure that intervals use gnc:time64-start-day-time to begin and gnc:time64-end-day-time to end. Grepping the reports directory for the two finds 21 calls of the former and 36 of the latter, so some cases are getting it right.
Comment 3 Christopher Lam 2019-03-15 06:32:11 EDT
Ok trying to debug this.

(gnc-ctime (time64-start-day-time time)) returns something like
"Sun Jun 01 00:00:00 1969"

(gnc-ctime (time64-end-day-time time)) returns
"Sun Aug 31 23:59:59 1969"

(gnc-ctime (xaccTransGetDate txn)) returns
"Wed Dec 31 18:59:00 1969" in my locale.

This means the start&end period times are sensitive to TZ. I think I'll need to find a mathematically sound strategy, by dividing time64 by 86400, and adjusting start/end periods by 0.5 -- this is the safest one
Comment 4 John Ralls 2019-03-15 10:48:30 EDT
Why? Unless the user is in a mid-Pacific island country the transaction time will be between 00:00:00 and 23:59:59 local time. 

Note, however, that the sample file has a posted time of 00:00:00 +0200. The save light isn't coming on, so the scrub that's supposed to fix that isn't working. It also means that GnuCash reports the date posted as 21 April to the west of +0200 and 22 April there and to the east.

Problem 1: The report appears to be using doubles to accumulate balances. That's OK for small balances like the example, but will introduce errors when there are more than 6 digits needed. Amounts should be either rationals or exact decimals with no rounding until presentation.

Problem 2: The report averages over seconds, collecting balance * interval. That the period start and end times are in local time and the transaction posted times are steady in UTC is what creates the time-zone variation. Note that this is true even in the example file: 2015-04-22 00:00:00 -0200 is just a weird way to write 2015-04-21 22:00:00 UTC. The obvious way to make this stable is to do the interval in UTC. I think that that would still produce unexpected results as average balances are generally calculated based on days rather than seconds. Doing an exact integer (/ interval 86400) (seconds in a day), adding a day to the last interval, will fix that and, since the remainder is discarded, removes the need for changing the timezone intervals.

That's the best we'll be able to do. The various countries that require reporting average balances will each have their own prescribed method of calculating those averages and GnuCash can at best satisfy only one of them.
Comment 5 Christopher Lam 2019-03-18 20:54:11 EDT
I had a solution being worked on until I unfortunately overwrote my main partition :-/

I think I know how to approach this; the test suite will set:
15-feb Income $100 (not a leap-year)
15-apr Income $100

and monthly averages from 31-jan onwards must be:
31-jan $0
28-feb $50
31-mar $100
30-apr $150

because I surmise that first 14 days of february will have an end-of-day balance of $0 and last 14 days will have balance of $100, and the average MUST be $50 exactly.

This confirms the current average-balance is currently buggy.

As soon as I've repaired my build I'll work on it. I'm not sure it'll be fixed for 3.5 though.
Comment 6 Christopher Lam 2019-03-18 20:57:59 EDT
Oops typo I meant
15-feb Income $100
16-apr Income $100

Note You need to log in before you can comment on or make changes to this bug.