In various sciences one often has to process voluminous data spread over many files which (hopefully) all follow the same format. A typical data file might have a line of environment variables, followed by raw data, which repeats in columns and/or rows, e.g.:

    4,23,45,56,78
    12,23,45,56,67,78,89,87,…,72
    32,34,23,21,23,56,43,23,…,34
    34,54,32,45,89,76,54,98,…,58
    67,67,88,32,34,21,22,97,…,51

A typical section of C code which would use this might look like:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Header line: five comma-separated values, stored in config[1..5]. */
        float config[8];
        scanf("%f,%f,%f,%f,%f", &config[1], &config[2], &config[3],
              &config[4], &config[5]);
        printf("%f,%f,%f,%f,%f\n", config[1], config[2], config[3],
               config[4], config[5]);

        /* The first header value gives the number of records that follow. */
        int RECORDS = (int) config[1];
        int i;
        if (RECORDS < 100) {
            int SIZE = sizeof(float) * RECORDS;
            float *deltaCSMean  = malloc(SIZE);
            float *deltaCSSigma = malloc(SIZE);
            float *CQMean       = malloc(SIZE);
            float *CQSigma      = malloc(SIZE);
            float *etaMean      = malloc(SIZE);
            float *etaSigma     = malloc(SIZE);
            float *brdF2        = malloc(SIZE);
            float *brdF1        = malloc(SIZE);
            float *amplitude    = malloc(SIZE);
            /* Each record is nine comma-separated values; echo it back out. */
            for (i = 0; i < RECORDS; i++) {
                scanf("%f,%f,%f,%f,%f,%f,%f,%f,%f", &deltaCSMean[i],
                      &deltaCSSigma[i], &CQMean[i], &CQSigma[i],
                      &etaMean[i], &etaSigma[i],
                      &brdF2[i], &brdF1[i], &amplitude[i]);
                printf("%f,%f,%f,%f,%f,%f,%f,%f,%f\n", deltaCSMean[i],
                       deltaCSSigma[i], CQMean[i], CQSigma[i],
                       etaMean[i], etaSigma[i], brdF2[i], brdF1[i], amplitude[i]);
            }
        }
        return 0;
    }

Using redirection, a simple invocation of the executable foo with input foo1.txt and output foo2.txt would of course be:

    ./foo < foo1.txt > foo2.txt

which is fine until one needs different data from the file. Supposing that in this example the data repeats along columns, one can use awk and a pipe to reshape the data input, removing the need to edit and recompile the C source for different data subsets:

    gawk 'BEGIN{FS=","; print "4,12,23,44,56"; getline;} {for(i=lowBnd;i foo2.txt

where lowBnd and upBnd correspond to the data column limits in the input file.
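As a minimal sketch of the column-selection idea (not the original command, whose loop is not shown in full), the following script writes a new header line, skips the file's own header, and prints only a chosen window of columns from each row. Several things here are assumptions made for illustration: the helper name select_columns, the file name sample_in.txt, the shortened sample rows, the use of plain awk rather than gawk, and the treatment of both lowBnd and upBnd as inclusive, 1-based column numbers.

```shell
#!/bin/sh
# Hypothetical sample input: one header line, then rows of comma-separated
# data (shortened versions of the rows shown in the text).
cat > sample_in.txt <<'EOF'
4,23,45,56,78
12,23,45,56,67,78,89,87,72
32,34,23,21,23,56,43,23,34
34,54,32,45,89,76,54,98,58
67,67,88,32,34,21,22,97,51
EOF

# select_columns LOW UP FILE  (hypothetical helper)
# Prints a replacement header, discards the file's own header line, then
# prints fields LOW..UP (assumed inclusive) of every remaining row.
select_columns() {
    awk -v lowBnd="$1" -v upBnd="$2" '
        BEGIN {
            FS = ","
            print "4,12,23,44,56"   # replacement header taken from the text
            getline                 # consume the original header line
        }
        {
            line = $lowBnd
            for (i = lowBnd + 1; i <= upBnd; i++)
                line = line "," $i
            print line
        }' "$3"
}

# Example: keep columns 2 through 4 of each data row.
select_columns 2 4 sample_in.txt
```

In a real run the output would be piped into the compiled program rather than printed, for example `select_columns 2 10 foo1.txt | ./foo > foo2.txt`, where a nine-column window matches the nine values the C code reads per record. Changing the two bounds selects a different data subset without touching the C source.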

