From icrazy at gmail.com Thu Aug 28 09:48:14 2008
From: icrazy at gmail.com (Kuang He)
Date: Thu, 28 Aug 2008 12:48:14 -0400
Subject: [sac-dev] A possible way to fix "WARNING: Number 4003 integer too large"

Hi there,

I met this "WARNING: Number 4003 integer too large" when I tried to
multiply my data by a factor of 10^9, and found in the changelog that
it is a known bug.

I took a look at the corresponding code snippet in src/ucf/cnvati.c
starting from line 163:

    /* - Build integer from the stored digits. */
    ifac = 1;
    *intgr = 0;
    for( j2 = j1; j2 >= 1; j2-- ){
        /* -- Warning.. Approaching maximum integer size.
         * - Use a fudge factor of 100k to test present integer value.*/
        if( (ifac == 1000000000) || (*intgr >= (MLARGE - 100000)) ){
            if ( lstrict ) { /* lstrict added. maf 970129 */
                cmicnv.icnver = 4003;
                setmsg( "WARNING", cmicnv.icnver );
                apcmsg( "integer too large\a", 19 );
                outmsg();
                clrmsg();
            }
        }
        else{
            *intgr = *intgr + Ni[j2]*ifac;
        }
        ifac = 10*ifac;
    }
    *intgr = isign**intgr;

In my opinion, instead of using the magic numbers 1000000000 and 100000
above, it might be better to use strtol() to convert strings to long
ints. strtol() can also report that the resulting value is out of range
(by setting errno to ERANGE). Based on the Linux man page of strtol()
I've been reading, this function conforms to SVr4, 4.3BSD, C89, C99
and POSIX.1-2001, so there should be no portability problems in using
it at all.

Best regards,

--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046

Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/

From savage at uri.edu Thu Aug 28 10:01:12 2008
From: savage at uri.edu (Brian Savage)
Date: Thu, 28 Aug 2008 13:01:12 -0400
Subject: [sac-dev] A possible way to fix "WARNING: Number 4003 integer too large"

Kuang He,

You are correct that this routine contains a bug, and correct again
that it will be changed to use the strtol() routine in future releases.
I have routines ready to replace this one and the one in cnvatf.c as
well. This is a major change to the code, as it is heavily used and the
ramifications of changing it need to be dealt with properly.

Thanks for your interest. If you have other suggestions or code/bug
fixes we would be glad to hear them.

Cheers
Brian

_______________________________________________
sac-dev mailing list
sac-dev at iris.washington.edu
http://www.iris.washington.edu/mailman/listinfo/sac-dev

From icrazy at gmail.com Thu Aug 28 14:40:24 2008
From: icrazy at gmail.com (Kuang He)
Date: Thu, 28 Aug 2008 17:40:24 -0400
Subject: [sac-dev] Fix for the missing equation in the help file of 'envelope'

Hi,

An equation is missing from the help file of 'envelope':
sqrt(x(n)^2 + y(n)^2).

Here is the fix (I've reformatted the paragraph after inserting the
missing equation):

$ diff -u aux/help/envelope.old aux/help/envelope
--- aux/help/envelope.old 2008-08-28 17:33:18.000000000 -0400
+++ aux/help/envelope 2008-08-28 17:34:10.000000000 -0400
@@ -9,9 +9,9 @@
 DESCRIPTION:
 This command computes the envelope function of the data in memory. The
-envelope is defined by: where x(n) is the original signal and y(n) its
-Hilbert transform (see HILBERT.) As with HILBERT, very long period data
-should be decimated (see DECIMATE) prior to processing.
+envelope is defined by: sqrt(x(n)^2 + y(n)^2), where x(n) is the original
+signal and y(n) its Hilbert transform (see HILBERT.) As with HILBERT, very
+long period data should be decimated (see DECIMATE) prior to processing.

 HEADER CHANGES: DEPMIN, DEPMAX, DEPMEN

Best regards,

--
Kuang He

From snoke at vt.edu Fri Aug 29 10:52:35 2008
From: snoke at vt.edu (Arthur Snoke)
Date: Fri, 29 Aug 2008 13:52:35 -0400
Subject: [sac-dev] A possible way to fix "WARNING: Number 4003 integer too large"
Message-ID: <48B83763.9060905@vt.edu>

Thanks for pointing this out. My old manual has the expression, but
with a square root sign and superscripts. Here is what will be in 101.2.

DESCRIPTION:
This command computes the envelope function of the data in memory. The
envelope is defined by the square root of x(n)^2 + y(n)^2, where x(n) is
the original signal and y(n) its Hilbert transform (see HILBERT). As
with HILBERT, very long period data should be decimated (see DECIMATE)
prior to processing.

From icrazy at gmail.com Sat Aug 30 18:33:29 2008
From: icrazy at gmail.com (Kuang He)
Date: Sat, 30 Aug 2008 21:33:29 -0400
Subject: [sac-dev] sacswap's byte swapping bug

Hi,

I found a problem with sacswap.
Although sacswap must be used less often nowadays, given sac's ability
to read both big- and little-endian sac files, some people I know are
still using it, since some legacy software such as pssac [3] can only
deal with big-endian sac files.

PROBLEM:

I used sacswap on my linux system to convert a random sac file vel.sac
(little-endian) to vel.sac.swap (big-endian), and then from
vel.sac.swap (big-endian) to vel.sac.swap.swap (little-endian). A
binary file comparison shows that files vel.sac and vel.sac.swap.swap
are not the same. I read files vel.sac and vel.sac.swap into matlab and
found that 8 or so data points differed between them, i.e. the
differences were first introduced in the first sacswap conversion. I've
confirmed that all the file headers from the two files are the same,
and NPTS of vel.sac is more than 10,000, so the error rate is pretty
small.

The sac file I used is here, in case you want to try it:
http://maxwell.phys.uconn.edu/~icrazy/sac/vel.sac

To narrow the problem down, I extracted one of the numbers that was
changed by the sacswap conversions and made a small program to
reproduce the problem. The program is here:
http://maxwell.phys.uconn.edu/~icrazy/sac/endian1.c

I found that the culprit is the function float_swap() below, which
according to [1] is not a good way to implement byte swapping.

    float float_swap(char cbuf[]) {
        union {
            char cval[4];
            float fval;
        } f_union;

        f_union.cval[3] = cbuf[0];
        f_union.cval[2] = cbuf[1];
        f_union.cval[1] = cbuf[2];
        f_union.cval[0] = cbuf[3];
        return(f_union.fval);
    }

The program endian1.c takes four bytes (0xff, 0xae, 0x47, and 0xc5)
as input and then swaps the bytes (1st with 4th, 2nd with 3rd). The
correct output should be:

    ff ae 47 c5 --> c5 47 ae ff

However, on my system (Pentium 4 CPU + GCC 4.2.3), I got different
results with different gcc optimization levels:

    $ gcc endian1.c -o endian1 && ./endian1
    ff ae 47 c5 --> c5 47 ee ff
    (wrong, "ae" was mysteriously changed to "ee")

    $ gcc -O2 endian1.c -o endian1 && ./endian1
    ff ae 47 c5 --> c5 47 ee ff
    (wrong, "ae" was mysteriously changed to "ee")

    $ gcc -O3 endian1.c -o endian1 && ./endian1
    ff ae 47 c5 --> c5 47 ae ff
    (correct)

I also found that this behavior is system dependent. I think it might
depend only on the gcc version, but I'm listing the CPU as well.

Systems which give correct byte swapping results only when using
gcc -O3:
  - Intel(R) Pentium(R) 4 CPU 2.40GHz
    gcc (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
  - AMD Athlon(tm) MP 2800+ (2133MHz)
    gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)

Systems which give correct byte swapping results no matter what:
  - Intel(R) Xeon(R) CPU E5320 @ 1.86GHz
    gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
  - Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
    gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)

FIX:

Ref [2] (page 9) gives some example byte swapping macros, which in my
opinion are equivalent to the byteswap() function in sac's source code
src/ucf/byteswap.c. I took this function and adapted my program
endian1.c to endian_byteswap.c below:

http://maxwell.phys.uconn.edu/~icrazy/sac/endian_byteswap.c

and everything works!

I also did a complete update of sacswap:

http://maxwell.phys.uconn.edu/~icrazy/sac/sacswap.c

using byteswap(), while trying to keep as much of the original code
intact as possible.

P.S. I don't know about the attachment policy of this mailing list. If
needed (maybe for archiving purposes), I can attach all the program
files mentioned here in a reply post.
[1] http://www.power.org/devcon/07/Session_Downloads/PADC07_McQuaid_PA_Wrobel_Heinz_20070910.pdf

[2] Endianness White Paper
http://www.intel.com/design/intarch/papers/endian.pdf

[3] pssac
http://www.eas.slu.edu/People/LZhu/downloads/pssac.tar

Best regards,

--
Kuang He

From george at gly.bris.ac.uk Sun Aug 31 05:06:23 2008
From: george at gly.bris.ac.uk (George Helffrich)
Date: Sun, 31 Aug 2008 13:06:23 +0100
Subject: [sac-dev] sacswap's byte swapping bug
Message-ID: <809f9750f3be254f6bf33b06bac4e24a@gly.bris.ac.uk>

Dear Huang -

Good work tracking this down. What it shows is that you are running
into compiler problems! Even though your reference [1] says that using
unions to swap is bad, there is actually nothing wrong with it. It
might be inefficient on hardware, and, though [1] does not say why,
there are two possible problems that could arise: 1) loss of efficiency
in swapping due to sign extension; 2) alignment problems in the union.
I suspect that the problem you are having is due to botched
optimization and/or sign extension problems.

Can you make the following test? Change your routine like so, and run
your tests:

    float float_swap(unsigned char cbuf[]) {
        union {
            unsigned char cval[4];
            float fval;
        } f_union;

        f_union.cval[3] = cbuf[0];
        f_union.cval[2] = cbuf[1];
        f_union.cval[1] = cbuf[2];
        f_union.cval[0] = cbuf[3];
        return(f_union.fval);
    }

That rules out sign extension as a contributor to the problems.

George Helffrich
george at geology.bristol.ac.uk

From icrazy at gmail.com Sun Aug 31 07:45:21 2008
From: icrazy at gmail.com (Kuang He)
Date: Sun, 31 Aug 2008 10:45:21 -0400
Subject: [sac-dev] sacswap's byte swapping bug
In-Reply-To: <809f9750f3be254f6bf33b06bac4e24a@gly.bris.ac.uk>
References: <809f9750f3be254f6bf33b06bac4e24a@gly.bris.ac.uk>

On Sun, Aug 31, 2008 at 8:06 AM, George Helffrich wrote:
> Dear Huang -
>
>    Good work tracking this down. What it shows is that you are running
> into compiler problems! Even though your reference [1] says that using
> unions to swap is bad, there is actually nothing wrong with it. It might be

Dear George,

I also suspected that it could be a compiler problem. However, I
tested endian1.c on another Pentium 4 machine with Visual C 6.0, and
it always gave the wrong answer. I know that maybe both gcc and VC6
have problems with union implementations, but the chances are small.

> inefficient on hardware, and, though [1] does not say why, there are two
> possible problems that could arise: 1) loss of efficiency in swapping due
> to sign extension; 2) alignment problems in the union. I suspect that the
> problem you are having is due to botched optimization and/or sign extension
> problems.
>
>    Can you make the following test?
> Change your routine like so, and
> run your tests:
>
> float float_swap(unsigned char cbuf[]) {
>     union {
>         unsigned char cval[4];
>         float fval;
>     } f_union;
>
>     f_union.cval[3] = cbuf[0];
>     f_union.cval[2] = cbuf[1];
>     f_union.cval[1] = cbuf[2];
>     f_union.cval[0] = cbuf[3];
>     return(f_union.fval);
> }
>
> That rules out sign extension as a contributor to the problems.

So I made the test, and nothing changed.

$ diff -u endian1.c endian1-unsigned.c
--- endian1.c 2008-08-30 19:10:29.000000000 -0400
+++ endian1-unsigned.c 2008-08-31 10:00:45.000000000 -0400
@@ -1,9 +1,9 @@
 #include
 #include

-float float_swap(char cbuf[]) {
+float float_swap(unsigned char cbuf[]) {
     union {
-        char cval[4];
+        unsigned char cval[4];
         float fval;
     } f_union;

@@ -17,7 +17,7 @@
 int main()
 {
     float f;
-    char cbuf[4];
+    unsigned char cbuf[4];
     unsigned char *p;

     cbuf[0] = 0xff;

$ gcc -Wall ./endian1-unsigned.c -o endian1-unsigned && ./endian1-unsigned
ff ae 47 c5 --> c5 47 ee ff
(wrong result)

$ gcc -O3 -Wall ./endian1-unsigned.c -o endian1-unsigned && ./endian1-unsigned
ff ae 47 c5 --> c5 47 ae ff
(correct result)

I also did a sizeof() of the above union, and it is 4 bytes on my
system.

Best regards,

--
Kuang He

From george at gly.bris.ac.uk Sun Aug 31 08:56:03 2008
From: george at gly.bris.ac.uk (George Helffrich)
Date: Sun, 31 Aug 2008 16:56:03 +0100
Subject: [sac-dev] sacswap's byte swapping bug
References: <809f9750f3be254f6bf33b06bac4e24a@gly.bris.ac.uk>

Dear Huang -

Thanks for making the check. Your test program is quite simple and
straightforward.

I was able to reproduce the problem on an Intel Mac mini (gcc 4.0.1).
It turns out that the code to swap is fine. If you change the floats to
ints, then the byte swapping works OK.

Here's what the problem is, however: your number is an IEEE NaN!
When it is returned by the function and stored, the bit pattern
changes. The reason the pattern changes is that your number is a
signaling NaN. A processor is allowed to clear the flag when a floating
point number is stored. That's the reason that the bit changes.
Depending on whether a floating point store or a fixed point store or
move is used, the result can differ depending on processor and/or code
optimization level.

So there's no problem with the byte swapping code.

George Helffrich
george at geology.bristol.ac.uk

From icrazy at gmail.com Sun Aug 31 10:41:24 2008
From: icrazy at gmail.com (Kuang He)
Date: Sun, 31 Aug 2008 13:41:24 -0400
Subject: [sac-dev] sacswap's byte swapping bug
References: <809f9750f3be254f6bf33b06bac4e24a@gly.bris.ac.uk>

Dear George,

Thanks for the enlightenment. I did not think of NaN before and was
not aware of the fact that a NaN is allowed to be changed by the
processor.

Just to clarify, my input number specified in endian1.c (listed below)
is not a NaN on little-endian systems; it actually represents the
number 0xC547AEFF. After byte swapping, this number becomes 0xFFAE47C5,
which appears to be a signaling NaN to a little-endian system.
    cbuf[0] = 0xff;
    cbuf[1] = 0xae;
    cbuf[2] = 0x47;
    cbuf[3] = 0xc5;

Still, I think the original sacswap program can lead to a very
undesirable effect: after using it, the values in sac files may
change. A simple way to demonstrate this is again to use the sac file
I've provided:

http://maxwell.phys.uconn.edu/~icrazy/sac/vel.sac

First, we byte swap it:

$ sacswap vel.sac
sacswap: Writing vel.sac.swap with npts = 14054 native => non-native

Then we convert the sac files to ascii files in order to make a text
comparison:

$ cat sac2ascii
CONVERT FROM SAC vel.sac TO ALPHA vel.sac.txt
CONVERT FROM SAC vel.sac.swap TO ALPHA vel.sac.swap.txt
quit

$ sac sac2ascii

If we used the original version of sacswap in the above byte swapping
step, files vel.sac.txt and vel.sac.swap.txt differ on my system (and
on some other systems as well, but not all of them).

$ diff -u vel.sac.txt vel.sac.swap.txt
--- vel.sac.txt 2008-08-31 13:06:19.000000000 -0400
+++ vel.sac.swap.txt 2008-08-31 13:06:19.000000000 -0400
@@ -9,7 +9,7 @@
  -12345  -12345  -12345  -12345  -12345
  -12345  -12345  -12345  -12345  -12345
  2848.977  304.7868  107.2769  25.62435  -12345
- -12345  -711.2866  0  0  -12345
+ -12345  -711.2802  0  0  -12345
  -12345  -12345  -12345  -12345  -12345
  -12345  -12345  -12345  -12345  -12345
  2008  133  6  30  54
@@ -1767,7 +1767,7 @@
  -442.2065  -627.9093  -794.7451  -949.0807  -1086.678
  -1217.899  -1345.818  -1483.826  -1629.344  -1787.957
  -1955.329  -2136.53  -2321.403  -2509.167  -2689.645
- -2865.094  -3031.295  -3194.937  -3348.609  -3491.529
+ -2865.094  -3031.295  -3198.937  -3348.609  -3491.529
  -3615.532  -3724.585  -3811.694  -3886.186  -3950.19
  -4018.139  -4088.735  -4161.889  -4223.314  -4275.219
  -4315.033  -4355.229  -4397.055  -4444.013  -4481.068
@@ -1781,7 +1781,7 @@
  -2900.382  -2733.463  -2557.34  -2380.768  -2199.411
  -2018.65  -1835.886  -1663.406  -1504.461  -1375.213
  -1279.549  -1231.151  -1227.254  -1269.1  -1342.068
- -1444.266  -1564.187  -1702.966  -1845.73  -1985.935
+ -1444.266  -1566.187  -1702.966  -1845.73  -1985.935
  -2108.854  -2215.427  -2302.428  -2382.116  -2451.169
  -2512.34  -2555.602  -2583.942  -2590.573  -2583.463
  -2562.272  -2535.475  -2494.733  -2445.612  -2387.887
@@ -1982,7 +1982,7 @@
  -80449.42  -80341.84  -80221.92  -80085.12  -79943.02
  -79792.08  -79642  -79486.1  -79331.84  -79174.38
  -79021.8  -78862.32  -78700.26  -78527.55  -78352.4
- -78166.28  -77976.15  -77775.91  -77575.99  -77370.91
+ -78166.28  -77976.15  -77775.91  -77703.99  -77370.91
  -77165.37  -76944.98  -76712.09  -76458.49  -76193.1
  -75908.02  -75610.84  -75292.95  -74960.52  -74601.54
  -74220.37  -73809.9  -73374.59  -72901.31  -72395.22
@@ -2074,8 +2074,8 @@
  49460.56  49606.18  49739.93  49870.56  49992.12
  50113.37  50231.81  50355.56  50468.37  50569.93
  50647.43  50709.93  50754  50788.06  50804
- 50807.43  50786.81  50750.87  50695.25  50621.5
- 50518.37  50394.31  50243.06  50076.5  49892.75
+ 50807.43  50786.81  50750.87  50695.25  50685.5
+ 50518.37  50394.31  50243.06  50140.5  49892.75
  49698.37  49488.37  49277.12  49060.56  48854.31
  48656.5  48478.37  48318.68  48187.75  48076.5
  47991.19  47920.56  47864  47810.25  47769.94
@@ -2084,12 +2084,12 @@
  47841.5  47901.19  47964.31  48041.5  48120.25
  48206.5  48290.25  48380.87  48463.53  48546.96
  48624.78  48707.28  48787.9  48879.15  48974
- 49080.4  49190.4  49311.5  49432.28  49558.37
+ 49080.4  49190.4  49375.5  49432.28  49558.37
  49682.12  49812.43  49940.4  50073.37  50203.21
  50335.71  50457.28  50570.56  50662.43  50738.21
  50786.5  50814.46  50813.06  50793.68  50747.2
  50680.64  50579.23  50446.34  50269.93  50061.26
- 49816.5  49547.98  49246.26  48914.78  48539.07
+ 49880.5  49547.98  49246.26  48914.78  48539.07
  48130.72  47686.66  47220.05  46724.55  46213.8
  45684.16  45148.57  44598.93  44042.48  43467.42
  42878.2  42263.68  41629.63  40966.04  40282.76
@@ -2655,7 +2655,7 @@
  -14297.84  -14457.91  -14600.05  -14759.44  -14899.54
  -15057.7  -15197.83  -15354.5  -15491.26  -15648.15
  -15789.54  -15950.98  -16095.28  -16259.3  -16402.67
- -16560.03  -16695.9  -16844.77  -16969.5  -17109.72
+ -16560.03  -16695.9  -16844.77  -17001.5  -17109.72
  -17225.52  -17350.89  -17453.12  -17570.5  -17664.18
  -17769.41  -17850.37  -17944.39  -18012.98  -18092.75
  -18143.65  -18201.23  -18230.68  -18269.39  -18278.61

One can of course argue that the differences in the changed data values
are not great, but it is certainly not desirable for users to have
their data changed without being told. However, if we use the updated
version of sacswap I provided earlier in the above byte swapping step,
files vel.sac.txt and vel.sac.swap.txt are always the same (at least
according to my own tests).

Therefore, to eliminate possible unwitting data changes, I think we
should update the original sacswap program to one that uses byteswap().

One can still argue that it might be easier to just add a "-O3"
parameter when compiling the original sacswap program, which seems to
eliminate the problem without touching the source code, but this trick
cannot be guaranteed to work on all platforms.

Best regards,

--
Kuang He
From icrazy at gmail.com Sun Aug 31 11:07:18 2008
From: icrazy at gmail.com (Kuang He)
Date: Sun, 31 Aug 2008 14:07:18 -0400
Subject: [sac-dev] Possible improvements of bin/sacinit.sh

Hi,

I just found some little things that could be improved in
bin/sacinit.sh.

> $ cat bin/sacinit.sh
> [snipped]
> # Initialize the SAC Enviornment
> export SACHOME=/usr/local/jas/sac

"export SACHOME=/usr/local/sac" may be a better general setting here.

> [snipped]
>
> # SAC_USE_DATABASE
> # Undefined or 1 -- Use SeisMgr Database ( Default )
> # 0 -- Do Not Use SeisMgr Database
> # The SeisMgr database attempts to keep the CSS data fields in line
> # with those in the SAC header. If you are handling CSS data it
> # would be wise to keep the database on. Using the SeisMgr database
> # currently can be very slow due when handling hundreds of files
> # and turning it off should show a dramatic speed increase.
> # Default: SAC_USE_DATABASE 1

Actually, I believe the default value of SAC_USE_DATABASE has been
changed to 0.

> [snipped]

Best regards,

--
Kuang He

From george at gly.bris.ac.uk Sun Aug 31 11:07:21 2008
From: george at gly.bris.ac.uk (George Helffrich)
Date: Sun, 31 Aug 2008 19:07:21 +0100
Subject: [sac-dev] sacswap's byte swapping bug
References: <809f9750f3be254f6bf33b06bac4e24a@gly.bris.ac.uk>
Message-ID: <2decde50ed9fc6a0f9419fcf415cb80a@gly.bris.ac.uk>

Dear Kuang -

If you demand that type of compatibility, then the straightforward
way to change sacswap would be to change the type of float_swap (and
the type in the union) to int, and store the swapped value as an int
(with a suitable cast). That would ensure against any bit changes due
to NaN behavior on intermediate machine architectures.

Can you provide a diff to sacswap.c to do that?
On 31 Aug 2008, at 18:41, Kuang He wrote: > Dear George, > > Thanks for the enlightenment, I did not think of NaN before and was > not aware of fact that the number NaN is allowed to be changed by the > processor. > > Just to clarify, the my input number specified in endian1.c (listed > below) under little-endian systems is not NaN, and it actually > represents the number 0xC547AEFF. After byte swapping, this number > will become 0xFFAE7A54, and appear to be signalling NaN to a little > endian system. > > cbuf[0] = 0xff; > cbuf[1] = 0xae; > cbuf[2] = 0x47; > cbuf[3] = 0xc5; > > Still, I think the original sacswap program can lead to a very > undesirable effect, that is, after using it, the values from sac files > may change. A simple way to demonstrate it is again to use the sac > file I've provided: > > http://maxwell.phys.uconn.edu/~icrazy/sac/vel.sac > > First, we byte swap it: > > $ sacswap vel.sac > sacswap: Writing vel.sac.swap with npts = 14054 native => non-native > > Then we convert the sac files to ascii files in order to make a text > comparison: > > $ cat sac2ascii > CONVERT FROM SAC vel.sac TO ALPHA vel.sac.txt > CONVERT FROM SAC vel.sac.swap TO ALPHA vel.sac.swap.txt > quit > > $ sac sac2ascii > > If we have used the original version of sacswap in the above byte > swapping step, files vel.sac.txt and vel.sac.swap.txt would differ on > my system (and on some other systems as well, but not all of them). 
> > $ diff -u vel.sac.txt vel.sac.swap.txt > --- vel.sac.txt 2008-08-31 13:06:19.000000000 -0400 > +++ vel.sac.swap.txt 2008-08-31 13:06:19.000000000 -0400 > @@ -9,7 +9,7 @@ > -12345 -12345 -12345 -12345 > -12345 > -12345 -12345 -12345 -12345 > -12345 > 2848.977 304.7868 107.2769 25.62435 > -12345 > - -12345 -711.2866 0 0 > -12345 > + -12345 -711.2802 0 0 > -12345 > -12345 -12345 -12345 -12345 > -12345 > -12345 -12345 -12345 -12345 > -12345 > 2008 133 6 30 54 > @@ -1767,7 +1767,7 @@ > -442.2065 -627.9093 -794.7451 -949.0807 > -1086.678 > -1217.899 -1345.818 -1483.826 -1629.344 > -1787.957 > -1955.329 -2136.53 -2321.403 -2509.167 > -2689.645 > - -2865.094 -3031.295 -3194.937 -3348.609 > -3491.529 > + -2865.094 -3031.295 -3198.937 -3348.609 > -3491.529 > -3615.532 -3724.585 -3811.694 -3886.186 > -3950.19 > -4018.139 -4088.735 -4161.889 -4223.314 > -4275.219 > -4315.033 -4355.229 -4397.055 -4444.013 > -4481.068 > @@ -1781,7 +1781,7 @@ > -2900.382 -2733.463 -2557.34 -2380.768 > -2199.411 > -2018.65 -1835.886 -1663.406 -1504.461 > -1375.213 > -1279.549 -1231.151 -1227.254 -1269.1 > -1342.068 > - -1444.266 -1564.187 -1702.966 -1845.73 > -1985.935 > + -1444.266 -1566.187 -1702.966 -1845.73 > -1985.935 > -2108.854 -2215.427 -2302.428 -2382.116 > -2451.169 > -2512.34 -2555.602 -2583.942 -2590.573 > -2583.463 > -2562.272 -2535.475 -2494.733 -2445.612 > -2387.887 > @@ -1982,7 +1982,7 @@ > -80449.42 -80341.84 -80221.92 -80085.12 > -79943.02 > -79792.08 -79642 -79486.1 -79331.84 > -79174.38 > -79021.8 -78862.32 -78700.26 -78527.55 > -78352.4 > - -78166.28 -77976.15 -77775.91 -77575.99 > -77370.91 > + -78166.28 -77976.15 -77775.91 -77703.99 > -77370.91 > -77165.37 -76944.98 -76712.09 -76458.49 > -76193.1 > -75908.02 -75610.84 -75292.95 -74960.52 > -74601.54 > -74220.37 -73809.9 -73374.59 -72901.31 > -72395.22 > @@ -2074,8 +2074,8 @@ > 49460.56 49606.18 49739.93 49870.56 > 49992.12 > 50113.37 50231.81 50355.56 50468.37 > 50569.93 > 50647.43 50709.93 50754 50788.06 > 50804 > - 
50807.43 50786.81 50750.87 50695.25 > 50621.5 > - 50518.37 50394.31 50243.06 50076.5 > 49892.75 > + 50807.43 50786.81 50750.87 50695.25 > 50685.5 > + 50518.37 50394.31 50243.06 50140.5 > 49892.75 > 49698.37 49488.37 49277.12 49060.56 > 48854.31 > 48656.5 48478.37 48318.68 48187.75 > 48076.5 > 47991.19 47920.56 47864 47810.25 > 47769.94 > @@ -2084,12 +2084,12 @@ > 47841.5 47901.19 47964.31 48041.5 > 48120.25 > 48206.5 48290.25 48380.87 48463.53 > 48546.96 > 48624.78 48707.28 48787.9 48879.15 > 48974 > - 49080.4 49190.4 49311.5 49432.28 > 49558.37 > + 49080.4 49190.4 49375.5 49432.28 > 49558.37 > 49682.12 49812.43 49940.4 50073.37 > 50203.21 > 50335.71 50457.28 50570.56 50662.43 > 50738.21 > 50786.5 50814.46 50813.06 50793.68 > 50747.2 > 50680.64 50579.23 50446.34 50269.93 > 50061.26 > - 49816.5 49547.98 49246.26 48914.78 > 48539.07 > + 49880.5 49547.98 49246.26 48914.78 > 48539.07 > 48130.72 47686.66 47220.05 46724.55 > 46213.8 > 45684.16 45148.57 44598.93 44042.48 > 43467.42 > 42878.2 42263.68 41629.63 40966.04 > 40282.76 > @@ -2655,7 +2655,7 @@ > -14297.84 -14457.91 -14600.05 -14759.44 > -14899.54 > -15057.7 -15197.83 -15354.5 -15491.26 > -15648.15 > -15789.54 -15950.98 -16095.28 -16259.3 > -16402.67 > - -16560.03 -16695.9 -16844.77 -16969.5 > -17109.72 > + -16560.03 -16695.9 -16844.77 -17001.5 > -17109.72 > -17225.52 -17350.89 -17453.12 -17570.5 > -17664.18 > -17769.41 -17850.37 -17944.39 -18012.98 > -18092.75 > -18143.65 -18201.23 -18230.68 -18269.39 > -18278.61 > > One can of course argue that the difference in values of the changed > data are not great, but it is certainly not desirable for users to > have their data changed without telling them. However, if we have used > the updated version of sacswap I've provided before in the above byte > swapping step, files vel.sac.txt and vel.sac.swap.txt will always be > the same (at least it is like this according to my own tests). 
>
> Therefore, to eliminate the possible unwitting data changing effects,
> I think we should update the original sacswap program to one that uses
> byte_swap(). One can still argue that it might be easier to just add a
> "-O3" parameter when compiling the original sacswap program, which
> seems to eliminate the problem without touching the source code, but
> this trick cannot be guaranteed to work on all the platforms.
>
> Best regards,
>
> --
> Kuang He
> Department of Physics
> University of Connecticut
> Storrs, CT 06269-3046
>
> Tel: +1.860.486.4919
> Web: http://www.phys.uconn.edu/~he/
>
> On Sun, Aug 31, 2008 at 11:56 AM, George Helffrich wrote:
>> Dear Kuang -
>>
>> Thanks for making the check. Your test program is quite simple and
>> straightforward.
>>
>> I was able to reproduce the problem on an Intel Mac mini (gcc 4.0.1).
>> It turns out that the code to swap is fine. If you change the floats
>> to ints, then the byte swapping works OK.
>>
>> Here's what the problem is, however: your number is an IEEE NaN!
>> When it is returned by the function and stored, then the bit pattern
>> changes. The reason the pattern changes is that your number is a
>> signaling NaN. A processor is allowed to clear the flag when a
>> floating point number is stored. That's the reason that the bit
>> changes. Depending on whether a floating point store or a fixed point
>> store or move is used, the result can differ depending on processor
>> and/or code optimization level.
>>
>> So there's no problem with the byte swapping code.
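
The signalling-NaN explanation above can be checked on the bit pattern alone, without any floating-point operation. The following is a sketch under stated assumptions: the function names are hypothetical, the values are IEEE 754 single precision, and the encoding is the common one (as on x86) in which a clear top mantissa bit marks a NaN as signalling and the processor quiets it by setting that bit.

```c
#include <stdint.h>

/* Hypothetical helper: return 1 if the 32-bit pattern encodes a
 * single-precision signalling NaN, i.e. exponent all ones, nonzero
 * mantissa, and the top mantissa bit (the "quiet" bit) clear. */
static int is_signalling_nan32(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t mantissa = bits & 0x007FFFFFu;
    uint32_t quiet    = bits & 0x00400000u;

    return exponent == 0xFFu && mantissa != 0 && quiet == 0;
}

/* Hypothetical helper: the pattern that results when a processor
 * quiets a signalling NaN by setting the quiet bit. */
static uint32_t quieted32(uint32_t bits)
{
    return bits | 0x00400000u;
}
```

Under these assumptions, reversing the bytes of 0xC547AEFF gives 0xFFAE47C5, which is_signalling_nan32() classifies as a signalling NaN. Quieting it yields 0xFFEE47C5, and reversing the bytes again gives 0xC547EEFF rather than the original word. That is a single mantissa bit, which at this magnitude is worth exactly 4.0, in line with the -3194.937 versus -3198.937 pair in the diff.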
> _______________________________________________
> sac-dev mailing list
> sac-dev at iris.washington.edu
> http://www.iris.washington.edu/mailman/listinfo/sac-dev
>

George Helffrich george at geology.bristol.ac.uk

From icrazy at gmail.com Sun Aug 31 11:25:26 2008
From: icrazy at gmail.com (Kuang He)
Date: Sun, 31 Aug 2008 14:25:26 -0400
Subject: [sac-dev] sacswap's byte swapping bug
In-Reply-To: <2decde50ed9fc6a0f9419fcf415cb80a@gly.bris.ac.uk>
References: <809f9750f3be254f6bf33b06bac4e24a@gly.bris.ac.uk> <2decde50ed9fc6a0f9419fcf415cb80a@gly.bris.ac.uk>
Message-ID:

Dear George,

Hmm, while I still think the previous way of doing it using byte_swap() is
correct, using your way certainly would generate a much smaller diff file.
The new diff file is attached, and can also be found at:

http://maxwell.phys.uconn.edu/~icrazy/sac/sacswap.c.diff

The original diff file using byte_swap() is also put here for comparison:

http://maxwell.phys.uconn.edu/~icrazy/sac/sacswap.c.diff2

Best regards,

--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046

Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/

On Sun, Aug 31, 2008 at 2:07 PM, George Helffrich wrote:
> Dear Kuang -
>
> If you demand that type of compatibility, then the straightforward
> way to change sacswap would be to change the type of float_swap (and the
> type in the union) to int, and store the swapped value as an int (with a
> suitable cast). That would ensure against any bit changes due to NaN
> behavior on intermediate machine architectures. Can you provide a diff to
> sacswap.c to do that?
>
> On 31 Aug 2008, at 18:41, Kuang He wrote:
>
>> Dear George,
>>
>> Thanks for the enlightenment, I did not think of NaN before and was
>> not aware of the fact that a NaN is allowed to be changed by the
>> processor.
>>
>> Just to clarify, my input number specified in endian1.c (listed
>> below) under little-endian systems is not NaN, and it actually
>> represents the number 0xC547AEFF.
>> After byte swapping, this number
>> will become 0xFFAE47C5, and appear to be a signalling NaN to a little
>> endian system.
>>
>> cbuf[0] = 0xff;
>> cbuf[1] = 0xae;
>> cbuf[2] = 0x47;
>> cbuf[3] = 0xc5;
>>
>> Still, I think the original sacswap program can lead to a very
>> undesirable effect, that is, after using it, the values from sac files
>> may change. A simple way to demonstrate it is again to use the sac
>> file I've provided:
>> [ ... snipped ...]
>> Therefore, to eliminate the possible unwitting data changing effects,
>> I think we should update the original sacswap program to one that uses
>> byte_swap(). One can still argue that it might be easier to just add a
>> "-O3" parameter when compiling the original sacswap program, which
>> seems to eliminate the problem without touching the source code, but
>> this trick cannot be guaranteed to work on all the platforms.
>>
>> Best regards,
>>
>> --
>> Kuang He
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sacswap.c.diff
Type: application/octet-stream
Size: 1276 bytes
Desc: not available
URL:

From snoke at vt.edu Sun Aug 31 13:29:48 2008
From: snoke at vt.edu (Arthur Snoke)
Date: Sun, 31 Aug 2008 16:29:48 -0400
Subject: [sac-dev] Possible improvements of bin/sacinit.sh
In-Reply-To:
References:
Message-ID: <48BAFF3C.8000905@vt.edu>

Thanks for your feedback. See comments interspersed.

Kuang He wrote:
> Hi,
>
> I just found some little things that could be improved for bin/sacinit.sh .
>
>> $ cat bin/sacinit.sh
>> [snipped]
>> # Initialize the SAC Enviornment
>> export SACHOME=/usr/local/jas/sac
>
> "export SACHOME=/usr/local/sac" may be a better general setting here.
>

This was carelessness on my part in preparing the binary distributions.
In the source distribution, the first few lines in sacinit.sh.in are

# After selecting the environment options, the script can be run as a
# stand-alone initialization, but to preserve the environment one should
# put them directly in the session start-up scripts.
#
# Initialize the SAC Environment
export SACHOME=__SAC_PREFIX__
export PATH=${PATH}:${SACHOME}/bin
export SACAUX=${SACHOME}/aux

Accordingly, SACHOME gets set to whatever the PREFIX was set to be in the
./configure stage. On most machines on which I prepare the binary
distributions, I do not have root privileges and have the system manager
set up a directory /usr/local/jas for which I am the owner. In hindsight,
and in the future, I will edit the sacinit.sh and sacinit.csh before making
the final .tar.gz file -- taking out /jas.

>> [snipped]
>>
>> # SAC_USE_DATABASE
>> # Undefined or 1 -- Use SeisMgr Database ( Default )
>> # 0 -- Do Not Use SeisMgr Database
>> # The SeisMgr database attempts to keep the CSS data fields in line
>> # with those in the SAC header. If you are handling CSS data it
>> # would be wise to keep the database on. Using the SeisMgr database
>> # currently can be very slow due when handling hundreds of files
>> # and turning it off should show a dramatic speed increase.
>> # Default: SAC_USE_DATABASE 1
>
> Actually, I believe the default value of SAC_USE_DATABASE has been changed to 0.

Earlier we had 1 as the default but changed to 0 and forgot to change these
files. That has been done now.
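
For illustration, the changed default could be read from the environment along the following lines. This is only a sketch and not SAC's actual code: the function name use_database_flag() is hypothetical, and it assumes that an unset, empty, or unparsable SAC_USE_DATABASE means 0 (database off) and that only the value 1 turns the database on.

```c
#include <stdlib.h>

/* Hypothetical helper: interpret the text of SAC_USE_DATABASE.
 * Assumed policy: unset, empty, or unparsable means 0 (database off),
 * matching the new default; only the value 1 turns the database on. */
static int use_database_flag(const char *value)
{
    char *end;
    long v;

    if (value == NULL || *value == '\0')
        return 0;               /* variable not defined: default is 0 */

    v = strtol(value, &end, 10);
    if (*end != '\0')
        return 0;               /* trailing junk: fall back to default */

    return v == 1;
}
```

A caller would pass the result of getenv("SAC_USE_DATABASE") straight to this function, since getenv() returns NULL when the variable is not set.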