ingesting:pdfw, revision [2012/11/26 18:34] by giancarlo (previous revision [2012/11/26 18:26] by giancarlo)
  * **On Back-end server** with ImageMagick and pdftk

Call the script //pdfatiff.sh// with the directory that contains the book directories as its parameter:
<WRAP prewrap center>
<code>
# ./pdfatiff.sh /srv/data/bookforingest
</code>
</WRAP>

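Before running the conversion, it can help to check that every book directory holds exactly one PDF. The following is a minimal sketch under assumptions, not part of //pdfatiff.sh//: the function names ''count_pdfs'' and ''check_books'' are hypothetical, and the check assumes the book directories are named openbess* and keep their PDF at the top level.

```bash
#!/bin/bash
# Hypothetical helper, not part of pdfatiff.sh: report how many PDFs each
# book directory holds, so problems show up before the conversion starts.

# Print the number of *.pdf files directly inside directory $1.
count_pdfs() {
    find "$1" -maxdepth 1 -type f -name '*.pdf' | wc -l | tr -d ' '
}

# Print one line per openbess* directory under $1.
# Return 1 if any of them does not hold exactly one PDF.
check_books() {
    local bdir=$1 status=0 bookdir n
    for bookdir in "$bdir"/openbess*/; do
        [ -d "$bookdir" ] || continue
        n=$(count_pdfs "$bookdir")
        if [ "$n" -eq 1 ]; then
            echo "OK    $bookdir"
        else
            echo "ERROR $bookdir ($n PDF files)"
            status=1
        fi
    done
    return $status
}
```

With these functions loaded in the shell, a possible call is ''check_books /srv/data/bookforingest && ./pdfatiff.sh /srv/data/bookforingest'', which starts the conversion only when every directory passes.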
<WRAP prewrap center>
<code bash pdfatiff.sh>
#!/bin/bash
# Convert the single PDF in each openbess* book directory under $1
# into one TIFF per page (0001.tif, 0002.tif, ...).

bdir=$1
startdir=$PWD

# Split command output on newlines only, so paths with spaces stay intact.
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
for bookdir in $(find "$bdir/"openbess* -maxdepth 0 -type d);
do
    echo "$bookdir"

    # Each book directory must contain exactly one PDF.
    n=0
    for nfile in $(find "$bookdir/"*.pdf -type f);
    do
        let "n += 1"
        filepdf="$nfile"
    done
    if [ $n -ne 1 ]
    then
        echo "ERROR: PDF file is not unique"
        exit 1
    fi

    # Work on a copy of the PDF in a temporary pdfs subdirectory.
    mkdir "$bookdir/pdfs"
    cp "$filepdf" "$bookdir/pdfs"
    cd "$bookdir/pdfs"

    # Split the copy into one PDF per page: pg-0001.pdf, pg-0002.pdf, ...
    pdftk "$(basename "$filepdf")" burst output pg-%04d.pdf

    n=0
    for nfile in $(find pg-*.pdf -type f);
    do
        let "n += 1"
        sn=$(printf "%04d" $n)
        filepdf="$nfile"
        echo "$filepdf -> $sn.tif"

        pdftk "$filepdf" output "temp.pdf"

        # For PDF from image
        # convert -density 150 "temp.pdf" "../$sn.tif"
        # For PDF from Word
        convert -background white -flatten -density 600 -resize 1200 -border 0.5% -bordercolor LightGray "temp.pdf" "../$sn.tif"
        rm "temp.pdf"
    done

    # Return to the starting directory and remove the temporary files.
    cd "$startdir"
    rm -R "$bookdir/pdfs"
done
IFS=$SAVEIFS
exit
</code>
</WRAP>
The script creates one TIFF file per PDF page (0001.tif, 0002.tif, ...) in every book directory.
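To confirm that a book directory really contains the full numbered sequence, something like the following could be used. It is only a sketch: ''count_tifs'' and ''check_tifs'' are hypothetical names, not part of //pdfatiff.sh//, and the expected page count has to be supplied by the caller.

```bash
#!/bin/bash
# Hypothetical helper, not part of pdfatiff.sh: check the numbered TIFFs
# that the conversion leaves in a book directory.

# Print the number of NNNN.tif files directly inside directory $1.
count_tifs() {
    find "$1" -maxdepth 1 -type f -name '[0-9][0-9][0-9][0-9].tif' | wc -l | tr -d ' '
}

# Return 0 if directory $1 holds 0001.tif .. NNNN.tif with no gaps,
# where NNNN is the expected page count given as $2; return 1 otherwise.
check_tifs() {
    local dir=$1 pages=$2 i
    [ "$(count_tifs "$dir")" -eq "$pages" ] || return 1
    for ((i = 1; i <= pages; i++)); do
        [ -f "$dir/$(printf '%04d' "$i").tif" ] || return 1
    done
}
```

The expected page count can be read from the source PDF with pdftk, which prints a ''NumberOfPages'' line in its ''dump_data'' report, for example ''pages=$(pdftk book.pdf dump_data | awk '/^NumberOfPages/ {print $2}')'' followed by ''check_tifs /path/to/bookdir "$pages"'' (both paths are placeholders).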
\\
\\
The book is now ready for ingesting.
