GnuCash
Contact   Instructions
Bug 798085 - Incorrect transactions import of entries with large number amount
Summary: Incorrect transactions import of entries with large number amount
Status: RESOLVED FIXED
Alias: None
Product: GnuCash
Classification: Unclassified
Component: Import - QIF (show other bugs)
Version: 4.4
Hardware: PC Windows
: Normal critical
Target Milestone: ---
Assignee: import
QA Contact: import
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-13 10:41 EST by Ruben Cheng
Modified: 2021-01-19 09:22 EST (History)
4 users (show)

See Also:


Attachments
QIF file with test cases (1.34 KB, text/plain)
2021-01-13 10:41 EST, Ruben Cheng
no flags Details
Quicken entries screenshot (154.44 KB, image/jpeg)
2021-01-13 10:46 EST, Ruben Cheng
no flags Details
Screenshot of QIF Import results in gnucash (210.26 KB, image/jpeg)
2021-01-13 10:47 EST, Ruben Cheng
no flags Details
QIF Export (42.41 KB, image/jpeg)
2021-01-14 14:23 EST, Ruben Cheng
no flags Details
Tracefile log from pull request (15.27 KB, text/plain)
2021-01-15 10:01 EST, Christopher Lam
no flags Details
Exported QIF File with debits (1.80 KB, text/plain)
2021-01-18 12:21 EST, Ruben Cheng
no flags Details
QIF file with test cases around 43,949,672.00 (1.93 KB, text/plain)
2021-01-18 12:28 EST, Ruben Cheng
no flags Details

Description Ruben Cheng 2021-01-13 10:41:12 EST
Created attachment 373973 [details]
QIF file with test cases

Deposit transactions of large number, greater than 21,474,835.47 is incorrectly imported in gnucash, making incorrect amount.

Reading documentation of QIF file format in Wikipedia, "T" code is used for amount and it seems that it's the entry used by GNU cash to import amount. 

However, there is another code "U" for amount that according to Wikipedia has the same behaviour than "T" code. I saw that correct transactions in the QIF file both entry have the same number, but for amount greater than 21,474,835.47 the "U" code has the correct amount, and T code has not the correct amount.

It seems that the "T" code is limited to 32 bit signed float number. I think that GNUCash should use "U" code for import instead of "T" if exists.

The QIF file was exported from Quicken 2017
https://en.wikipedia.org/wiki/Quicken_Interchange_Format

All attached entries are deposits. It seems that retrieval is not affected.
Comment 1 Ruben Cheng 2021-01-13 10:46:20 EST
Created attachment 373974 [details]
Quicken entries screenshot

This screenshot list entries in quicken. Attached QIF file in the bug report is an export of the origin account.

This entries display correctly
Comment 2 Ruben Cheng 2021-01-13 10:47:38 EST
Created attachment 373975 [details]
Screenshot of QIF Import results in gnucash

This screenshot display incorrectly the amount of an import of QIF file
Comment 3 Derek Atkins 2021-01-13 11:11:29 EST
I will point out that every time there is "MFAIL" in there, the U and T are different values (and also T is negative but U is positive).  For example:

D1/13'21
U21,474,836.48
T-21474836.48
PTransfer
MFAIL
L[Target]

If both T and U are in the transaction, which should take precedence?  I would argue having both, and having them be different values, would be an error at best, but the behavior would necessarily be undefined.  So, IMHO, this should be closed as "INVALID" because the data is broken.

Another example of a "FAIL" transaction:

D1/13'21
U22,000,000.00
T-20949672.96
PTransfer
MFAIL
L[Target]

I don't know where these different values are coming from, but THAT is your problem with this QIF.
Comment 4 Ruben Cheng 2021-01-13 12:44:01 EST
An observation: 
MFAIL = Memo: FAIL (it's a comment that this will fail in the import)

>If both T and U are in the transaction, which should take precedence?  I >would argue having both, and having them be different values, would be an >error at best, but the behavior would necessarily be undefined.  So, IMHO, >this should be closed as "INVALID" because the data is broken.

I think that "U" should take precedence

>I don't know where these different values are coming from, but THAT is your >problem with this QIF.
This QIF file where generated by Quicken 2017 (I can attach the quicken test case file if needed). I Don't know if later version has the same issue and I don't have it.

By the way, I did a test. I swapped "U" and "T" entries between them (U code is now T, and the former T code will be U), and I worked!. It was imported with the correct amounts

It seems that the "T" code in quicken is limited to 32 bits, and "U" isn't
Comment 5 Derek Atkins 2021-01-13 13:19:33 EST
(In reply to ruben.cheng from comment #4)
>
> By the way, I did a test. I swapped "U" and "T" entries between them (U code
> is now T, and the former T code will be U), and I worked!. It was imported
> with the correct amounts
> 
> It seems that the "T" code in quicken is limited to 32 bits, and "U" isn't

I do not believe this is a 32-bit overrun issue (at least in GnuCash).

The issue is that you have QIF transactions that sometimes have the same value in the T and U fields, and sometimes have two different values in the T and U fields; GnuCash uses the T field.  So when the values are the same, the import appears to work.  But when the values are different, GnuCash chooses a different value than Quicken does.

So really, this is a problematic QIF file, and you should look into how it was created.

I reiterate my response that I do not believe this is a GnuCash import bug.
Comment 6 John Ralls 2021-01-13 21:08:18 EST
Derek,
Time for a two's complement refresher for you. Ruben is right: A 32-bit signed int will represent 0x7fffffff as -2147483648. 2200000000 is 0x83215600, and 32-bit signed int will represent that as -2094967296.

https://metacpan.org/release/Finance-QIF/source/lib/Finance has a note that the "U" key was introduced by Quicken 2005.
Comment 7 Ruben Cheng 2021-01-13 21:52:17 EST
I know that issue is Quicken QIF export not GnuCash. GnuCash QIF Import it correctly only If I swap manually "U"<->"T" keys, which is a workaround that I found. 

Wikipedia also mention that "U" key was introducted in Quicken 2005. I think that "U" key maybe is a new version of "T" key without number length limitation, "T" key was left for compatiblity.

The export dialog in Quicken has few options, and nothing is related.

I read that many people still sticks with Quicken 2017 like me, and not using later version due a change to agresive subscription price model starting 2018 version
Comment 8 Derek Atkins 2021-01-14 08:47:08 EST
(In reply to John Ralls from comment #6)
> Derek,
> Time for a two's complement refresher for you. Ruben is right: A 32-bit
> signed int will represent 0x7fffffff as -2147483648. 2200000000 is
> 0x83215600, and 32-bit signed int will represent that as -2094967296.
> 
> https://metacpan.org/release/Finance-QIF/source/lib/Finance has a note that
> the "U" key was introduced by Quicken 2005.

Thank you for the refresher.  Indeed, it does look like an overflow.
I suppose we should update the qif importer to prefer U over T if it finds both?
Comment 9 John Ralls 2021-01-14 12:41:18 EST
Yes. We could go a little further and have a C function that casts the U value to int32_t and checks equality with the T value, raising an error if it fails.

Does the qif-imp numeric parser know how to handle thousands and decimal separators? For that matter, does Quicken localize or is it en_US-only?

Ruben, what Region and Language do you use for Quicken?
Comment 10 Derek Atkins 2021-01-14 12:57:34 EST
(In reply to John Ralls from comment #9)

> Yes. We could go a little further and have a C function that casts the U value > to int32_t and checks equality with the T value, raising an error if it fails.

Sure, we could do that.

> Does the qif-imp numeric parser know how to handle thousands and decimal
> separators? 

Yes, it does.

>  For that matter, does Quicken localize or is it en_US-only?

I believe it is localized by Quicken, but the importer knows how to detect the different formats.
Comment 11 Ruben Cheng 2021-01-14 14:12:54 EST
(In reply to John Ralls from comment #9)
> Yes. We could go a little further and have a C function that casts the U
> value to int32_t and checks equality with the T value, raising an error if
> it fails.
> 
> Does the qif-imp numeric parser know how to handle thousands and decimal
> separators? For that matter, does Quicken localize or is it en_US-only?
> 
> Ruben, what Region and Language do you use for Quicken?

I'm using US Quicken 2017 in English
Comment 12 Ruben Cheng 2021-01-14 14:23:55 EST
Created attachment 373976 [details]
QIF Export

I'm attaching an screenshot of QIF Export of Quicken 2017 if you need it
Comment 13 Christopher Lam 2021-01-15 09:02:17 EST
Why use C when guile will work well?

https://github.com/Gnucash/gnucash/pull/878
Comment 14 Christopher Lam 2021-01-15 10:01:18 EST
Created attachment 373977 [details]
Tracefile log from pull request

Ok that PR didn't work well, see the full log. I don't understand why T=number whereas U=string.
Comment 15 Derek Atkins 2021-01-15 10:11:17 EST
Comment on attachment 373977 [details]
Tracefile log from pull request

Are you running 'U' through the "amount parser" regex?
Comment 16 John Ralls 2021-01-15 12:40:52 EST
(In reply to Christopher Lam from comment #13)
> Why use C when guile will work well?
> 
> https://github.com/Gnucash/gnucash/pull/878

Because Guile won't work at all: It uses gmp to represent integers so they are both infinite length and use a sign bit instead of two's complement.

SCM check_32_bit_wrap (SCM T, SCM U)
{
    int32_t t-value = scm_to_int32_t(T);
    int64_t u-value = scm_to_int64_t(U);

    if (t == (int32_t)u_value)
        return SCM_BOOL_T;

    PERR("%d is not a 32-bit representation of %" G_GINT64_FORMAT, t-value, u-value);
    return SCM_BOOL_F;
}

https://gmplib.org/manual/Integer-Internals
Comment 17 John Ralls 2021-01-17 12:00:49 EST
ruben.cheng, all of the transactions in the sample QIF are debits (increases) to the cash account. In the ones that don't wrap the T and U strings are the same, but in the ones that do wrap the T strings lack thousands separators as well as being different numbers. Chris is proposing a string comparison that depends on valid T strings being the same as the U string. To be sure that it's a general solution we need to see some credit transactions, some wrapping and some not. Please make another example showing that.
Comment 18 John Ralls 2021-01-18 11:37:49 EST
ruben.cheng, another thought: What happens when the amount is > 42,949,672.96? Does T wrap back around to a small positive as one would expect?
Comment 19 Ruben Cheng 2021-01-18 12:19:55 EST
(In reply to John Ralls from comment #17)
> ruben.cheng, all of the transactions in the sample QIF are debits
> (increases) to the cash account. In the ones that don't wrap the T and U
> strings are the same, but in the ones that do wrap the T strings lack
> thousands separators as well as being different numbers. Chris is proposing
> a string comparison that depends on valid T strings being the same as the U
> string. To be sure that it's a general solution we need to see some credit
> transactions, some wrapping and some not. Please make another example
> showing that.

Hi.... all transactions in the test cases are deposits (credits) I'll add an exported  QIF file that includes 5 debits (The last 5 transactions)
Comment 20 Ruben Cheng 2021-01-18 12:21:31 EST
Created attachment 373979 [details]
Exported QIF File with debits

QIF File exported from Quicken with test cases. The last 5 transactions are debits, the first two U and T are the same, the last 3 aren't
Comment 21 Ruben Cheng 2021-01-18 12:27:19 EST
(In reply to John Ralls from comment #18)
> ruben.cheng, another thought: What happens when the amount is >
> 42,949,672.96? Does T wrap back around to a small positive as one would
> expect?

I added a QIF test case with debits and credits around this value 42,949,672.96
Comment 22 Ruben Cheng 2021-01-18 12:28:19 EST
Created attachment 373980 [details]
QIF file with test cases around 43,949,672.00

QIF Test cases generated from Quicken with debits and credits with amount around 43,949,672.00
Comment 23 John Ralls 2021-01-18 14:03:15 EST
Thanks. No surprises, which is good.

One thing to remember, though: A bank account is your asset, so a deposit is a *debit*, not a credit. To the bank it's a liability--they owe you the money--so to them a deposit is a credit. Asset accounts increase with debits, Liability and Equity accounts increase with credits.
Comment 24 Christopher Lam 2021-01-18 16:54:09 EST
(In reply to John Ralls from comment #23)
> Thanks. No surprises, which is good.

IIUC it looks like the U value, if present, is always the correct one?
Comment 25 John Ralls 2021-01-18 23:50:42 EST
Yes.
Comment 26 Christopher Lam 2021-01-19 04:20:42 EST
https://github.com/Gnucash/gnucash/pull/878 merged in, will be fixed for 4.5 due end of March 2021

U values will override T values if they're both present.
Comment 27 Ruben Cheng 2021-01-19 09:21:55 EST
I'll change the subject because the issue affected both debits and credits

Note You need to log in before you can comment on or make changes to this bug.