* terrapop_extract_1016.csv, and randomly draws 5000 households;
* from it. A household is defined by a shared serial number:;
* everyone in the household has the same serial number,;
* so 5000 households export as approximately 11000 people.;
* seed= fixes the random number seed so the draw is reproducible. method=srs is simple random sampling;
PROC IMPORT OUT= WORK.demos
DATAFILE= "filepath.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
proc surveyselect data= demos out=demos1 method=srs sampsize=5000 seed=377183 noprint;
samplingunit serial;
run;
* this is my random sample of 5000 households (approximately 11000 people);
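* Optional sanity check, a minimal sketch that is not part of the original program.;
* It assumes serial is the household identifier, as described above.;
* n_households should be 5000 and n_people should be roughly 11000.;
proc sql;
select count(distinct serial) as n_households,
count(*) as n_people
from work.demos1;
quit;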
options obs=max;
proc export data=work.demos1
outfile='Y:\demos1.csv'
dbms=csv
replace;
run;
Here is a little snapshot of the data...
44135083 | 201896 | 6 | 840001 | 840001 | 0 | 2 | 999 | 12000 | 20 | 350 | 11 | 1 | 2.65E+08 | 4583402 | 99 | 110 | 1 | 2 | 2 | 100 | 5 | 8407 | 2 | 45 | 0 | 11900 | 2010 |