This is the Wikimedia adds/changes dump service. Please read the copyright information. See Meta:Data dumps for documentation on the provided data formats.
Here's the big fat disclaimer.
This service is experimental. At any time it may not be working, for a day, a week or a month. It is not intended to replace the full XML dumps. We don't expect users to be able to construct full dumps of a given date from the incrementals and an older dump. We don't guarantee that the data included in these dumps is complete, or correct, or won't break your Xbox. In short: don't blame us (but do get on the email list and send mail: see xmldatadumps-l).
The data provided in these files is partial data. To be precise:
What is in these files:
The stubs file contains, for each page, the metadata of the revisions that were added within the time interval. It has exactly the same format as the history stubs files you would find on our XML data dumps page, but it includes only revisions made since the last adds/changes dump. This means you get metadata for articles, user pages, discussion pages, etc. If you want articles only, you will need to write a filter to grab just those entries.
The revs file consists of the metadata plus the wikitext for each new revision since the last adds/changes dump. This is in the same format as the pages-meta-history files you would find on our XML data dumps page. This means you get articles, user pages, discussion pages, etc. If you want articles only, you will need to write a filter to grab just those entries.
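The "articles only" filter mentioned above could look roughly like the following Python sketch. It assumes that each page element carries an ns child, as in the MediaWiki XML export format, and that articles live in namespace 0. The function names are made up for illustration and are not part of any Wikimedia tool. If the file you downloaded is compressed, pass a file object from gzip.open or bz2.open instead of a filename.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_article_pages(source, wanted_ns="0"):
    """Yield (title, page_element) for pages whose <ns> equals wanted_ns.

    `source` is a filename or a file object opened in binary mode.
    Assumption: <title> and <ns> are direct children of <page>.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        # Map direct children (title, ns, id, ...) to their text content.
        fields = {local_name(child.tag): child.text for child in elem}
        if fields.get("ns") == wanted_ns:
            yield fields.get("title"), elem
        # Drop the page's children once the caller has moved on.
        elem.clear()
```

The same function works on both the stubs file and the revs file, since it only looks at the page-level title and ns fields. Use the yielded element before asking for the next one, because it is cleared afterwards.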
The md5sums.txt file contains the md5 hash of the stubs file and the revs file, so that downloaders can verify the integrity of the files after download.
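A downloader might verify the files along these lines. This is only a sketch: it assumes md5sums.txt uses the common "hash, whitespace, filename" layout, one file per line, which you should check against a real md5sums.txt before relying on it. The function names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sums(text):
    """Parse lines of the assumed form '<hash>  <filename>' into a dict."""
    sums = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            # Some md5sum variants prefix the filename with '*'.
            sums[parts[1].strip().lstrip("*")] = parts[0].lower()
    return sums


def verify(path, name, md5sums_text):
    """True if the file at `path` matches the hash listed for `name`."""
    expected = parse_md5sums(md5sums_text).get(name)
    return expected is not None and md5_of_file(path) == expected
```

Here `name` is the filename as it appears in md5sums.txt, which may differ from the local path you saved the download to.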
The file maxrevid.txt contains the largest revision ID on the project as of 12 hours before the start of the run or, for dates that were run after the fact, as of 12 hours before midnight on that date.
The file status.txt, if it exists, will contain the value "done" in cases where the run is complete and was successful.
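Before processing a run, a consumer may want to check these two small files first. The sketch below assumes that each file holds just its value plus optional surrounding whitespace. That layout is an assumption rather than something this page promises, and the function names are hypothetical.

```python
def run_is_complete(status_text):
    """True only if status.txt exists and says 'done'.

    Pass None when the file does not exist; a missing file is
    treated as "not known to be complete".
    """
    return status_text is not None and status_text.strip() == "done"


def parse_maxrevid(maxrevid_text):
    """Return the revision ID stored in maxrevid.txt as an int."""
    return int(maxrevid_text.strip())
```

Treating anything other than "done" as incomplete is the cautious choice here, given that the service may not be working at any given time.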