Wikileaks – a website that has been discussed here before – has performed a significant public service by making nearly 7,000 reports prepared by the American Congressional Research Service publicly available. The documents are not secret, and were paid for with a billion dollars of taxpayer money. Prior to the Wikileaks action, they were not available to the general public. The research service is meant to be a non-partisan office that provides factual information and analysis to inform political decision-making.
Topics covered in the reports include Israel’s relationship with the United States, abortion, China, weapons proliferation, and many others.
According to the linked page:
“Open government lawmakers such as Senators John McCain (R-Arizona) and Patrick J. Leahy (D-Vermont) have fought for years to make the reports public, with bills being introduced–and rejected–almost every year since 1998. The CRS, as a branch of Congress, is exempt from the Freedom of Information Act.”
I wonder who leaked these. Do you think they were in an easy-to-digitize format, or do you think the Wikileaks people had to make these PDFs out of paper documents?
Each Wikileaks page says: “This report was obtained by Wikileaks staff from CRS computers accessible only from Congressional offices.”
The reports are also in a slightly odd format. For me, at least, it is impossible to copy and paste cleanly from the PDF files.
I think it’s just that the text in the PDF possesses line breaks, as a standard txt file does – that’s why you get this jumbly mess. There is a way around it – I know, because you showed me with your fancy word processor.
The line wrapping isn’t the problem.
On this system (Windows XP), the pasted text is rendered as empty ‘unknown character’ rectangles.
Actually, it only seems to happen with some of the files. It is probably some sort of character encoding issue.
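One rough way to test the encoding guess would be to paste the text into a script and look at which Unicode code points actually arrive. The sketch below is only an illustration, not something anyone in this thread ran: the function names `describe_chars` and `looks_unmapped` and the 0.5 threshold are made up for the example. It assumes that a PDF whose fonts lack a proper character mapping tends to yield private-use, unassigned, or control characters when copied, which is one common cause of the empty rectangles but not the only one.

```python
import unicodedata

def describe_chars(text):
    """List (char, code point, category, name) for every non-ASCII character."""
    report = []
    for ch in text:
        if ord(ch) > 127:
            report.append((
                ch,
                f"U+{ord(ch):04X}",
                unicodedata.category(ch),
                unicodedata.name(ch, "<unnamed>"),
            ))
    return report

def looks_unmapped(text, threshold=0.5):
    """Guess whether pasted text is mostly private-use, unassigned, or control
    characters. The threshold is an arbitrary illustrative value."""
    suspicious = {"Co", "Cn", "Cc"}  # private use, unassigned, control
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return False
    bad = sum(1 for ch in chars if unicodedata.category(ch) in suspicious)
    return bad / len(chars) >= threshold

if __name__ == "__main__":
    pasted = "\ue000\ue001\ue002"  # stand-in for text copied from a problem file
    for row in describe_chars(pasted):
        print(row)
    print("Probably an encoding/mapping problem:", looks_unmapped(pasted))
```

If the files that paste cleanly show ordinary letters here while the broken ones show mostly private-use code points, that would support the idea that the problem lies in how those particular PDFs encode their text rather than in the line wrapping.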