FINE Trial team restore raw data file!

A.B.

I haven't deleted the post, A.B., but I could have made a mistake when calculating the checksum.

The file I downloaded in December and the one I downloaded yesterday are both 41KB in size.

How large is the file you downloaded?
I think the mistake is probably on my part.

The new file: http://journals.plos.org/plosone/article/asset?unique&id=info:doi/10.1371/journal.pone.0157199.s001

The old file, in .csv format, provided by Bob: http://forums.phoenixrising.me/inde...from-pace-trial-team.44705/page-4#post-731449

I don't have the software to open .dta files so I converted it to .csv with the Pandas data analysis library for the Python programming language.
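For anyone who wants to repeat the conversion, a minimal sketch with pandas could look like the one below. The function name `dta_to_csv` and the file names in the usage note are made up for illustration; `read_stata` and `to_csv` are the pandas calls doing the work, and the exact settings used for the conversion described above are not known.

```python
import pandas as pd


def dta_to_csv(dta_path, csv_path):
    """Read a Stata .dta file and write it out as CSV; return the row count."""
    # read_stata parses the Stata binary format into a DataFrame.
    frame = pd.read_stata(dta_path)
    # index=False keeps the pandas row index out of the CSV.
    frame.to_csv(csv_path, index=False)
    return len(frame)
```

Calling `dta_to_csv("journal.pone.0157199.s001.DTA", "new_file.csv")` on the downloaded file should then give a CSV that opens in any spreadsheet program (the output name `new_file.csv` is just a placeholder).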

The new file, after conversion, is 25 kB. The old is 19.3 kB.

In the new file, integer values are reported as floating point values (for example 266 becomes 266.0), and a floating point value has greater precision (3.400000095 becomes 3.4000000953674316).
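Both effects are what you would expect if the value is stored in the .dta file as a 32-bit float and then widened to a 64-bit float on reading. That storage type is an assumption here, not something checked against the file itself. A small sketch using only the standard library (the helper name `widen_float32` is hypothetical):

```python
import struct


def widen_float32(value):
    """Round a number to 32-bit precision, then return it as a 64-bit float."""
    # Pack as a 4-byte IEEE 754 float, then unpack; Python floats are 64-bit.
    return struct.unpack("<f", struct.pack("<f", value))[0]


stored = widen_float32(3.4)
print(repr(stored))            # 3.4000000953674316
print(format(stored, ".10g"))  # 3.400000095
print(float(266))              # 266.0
```

The same underlying number prints as 3.400000095 when limited to 10 significant digits and as 3.4000000953674316 at full 64-bit precision, so the two CSV files can look different without the data having changed.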

What has probably happened is that Bob's file was created with export settings that truncate some values.
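One way to check this, rather than assume it, is to compare the two .csv files cell by cell and treat numbers as equal when they agree to within a small tolerance. The following is only a sketch: the function names and the tolerance of 1e-8 are choices made for illustration, and it assumes both files have the same rows and columns in the same order.

```python
import csv
import math


def cells_match(a, b, rel_tol=1e-8):
    """True if two CSV cells are equal as text or as numbers within rel_tol."""
    if a == b:
        return True
    try:
        return math.isclose(float(a), float(b), rel_tol=rel_tol)
    except ValueError:
        return False  # at least one cell is non-numeric and the text differs


def csv_differences(old_path, new_path, rel_tol=1e-8):
    """Return (row, column, old, new) for every cell that really differs.

    Row 0 is the header line, if the files have one.
    """
    with open(old_path, newline="") as f_old, open(new_path, newline="") as f_new:
        old_rows = list(csv.reader(f_old))
        new_rows = list(csv.reader(f_new))
    if len(old_rows) != len(new_rows):
        raise ValueError("files have different numbers of rows")
    diffs = []
    for r, (old_row, new_row) in enumerate(zip(old_rows, new_rows)):
        if len(old_row) != len(new_row):
            raise ValueError(f"row {r} has a different number of columns")
        for c, (a, b) in enumerate(zip(old_row, new_row)):
            if not cells_match(a, b, rel_tol):
                diffs.append((r, c, a, b))
    return diffs
```

An empty result would mean the files differ only in how numbers are written (266 versus 266.0, or 3.400000095 versus 3.4000000953674316), which would support the export-settings explanation.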
 

user9876

Somebody posted here (the post appears to have been deleted) that the new and old files were identical in the sense of having the same SHA-1 hash. This cannot be correct because they are not even the same size. The new file contains more information. I'm currently trying to understand what has changed, and whether there is anything suspicious. It is probably nothing, but verifying is better than trusting.
I thought the files had the same sha256 hash. I had two files:
one called journal.pone.0144623.s002.DTA, which is 41953 bytes and dated 15th Dec 2015,
and one I recently downloaded called journal.pone.0157199.s001.DTA, the same size but dated 2nd June 2016.

Both had a sha256 hash of cee92b6cb97264aba1ec651c4d2e21f9888930d53ab0134c12442e2513070552

Is there a different file?

Last time I used some Python code to convert the file into something usable (I think it used the Pandas library).

Instructions for R and Python are here:
http://stackoverflow.com/questions/2536047/convert-stata-dta-file-to-csv-without-stata-software
 
Messages
435
Likes
368
I thought the files had the same sha256 hash. I had two files:
one called journal.pone.0144623.s002.DTA, which is 41953 bytes and dated 15th Dec 2015,
and one I recently downloaded called journal.pone.0157199.s001.DTA, the same size but dated 2nd June 2016.

Both had a sha256 hash of cee92b6cb97264aba1ec651c4d2e21f9888930d53ab0134c12442e2513070552
I get the same sha256 checksum as you from the Stata (.dta) file.

Is there a different file?
As far as I can tell there was only ever one file.
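For anyone who wants to check the two downloads themselves, a checksum comparison can be sketched with Python's standard `hashlib` module. The function names below are hypothetical; the file names in the usage note are the ones quoted above.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Keep reading until read() returns an empty bytes object at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files hash to the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

With both downloads in the current folder, `same_content("journal.pone.0144623.s002.DTA", "journal.pone.0157199.s001.DTA")` should return `True` if the files are byte-for-byte identical, and `sha256_of_file(...)` on either one can be compared against the digest quoted above.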
 

user9876

I cannot open the file. Has anyone opened it and checked it against the original?
The file is in a format used by a particular statistics package (Stata). It can be converted into more usable formats without that package. Some general instructions are here:
http://stackoverflow.com/questions/2536047/convert-stata-dta-file-to-csv-without-stata-software

I've attached my decoding of the original file as a comma-separated list.
 
