PDF to Text Converter Using Ruby

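#!/usr/bin/env ruby
# Converts each PDF given on the command line to a plain-text file
# written next to it as <filename>.txt.
require 'pdf/reader' # gem install pdf-reader

# Credits:
#   https://github.com/yob/pdf-reader/blob/master/examples/text.rb
# Usage example:
#   ruby pdf2txt.rb /path-to-file/file1.pdf [/path-to-file/file2.pdf..]
ARGV.each do |filename|
  PDF::Reader.open(filename) do |reader|
    puts "Converting: #{filename}"
    page_count = reader.page_count

    # Extract the text of every page; a page that fails yields an empty string.
    txt = reader.pages.each_with_index.map do |page, index|
      pageno = index + 1
      begin
        # \r moves the cursor back to the line start so the progress line is overwritten.
        print "Converting page #{pageno}/#{page_count}\r"
        page.text
      rescue StandardError => e
        # Report the failure on stderr and continue with the remaining pages.
        warn "Page #{pageno}/#{page_count} failed to convert: #{e.message}"
        ''
      end
    end

    puts "\nWriting text to disk"
    # The output is written alongside the input, e.g. file1.pdf.txt.
    File.write("#{filename}.txt", txt.join("\n"))
  end
end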
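
The script above keeps going when a single page fails to convert, substituting an empty string for that page. A minimal sketch of that pattern in isolation follows, so it can be tried without a real PDF. The helper name `pages_to_text` and the stand-in classes `GoodPage` and `BadPage` are hypothetical and are not part of pdf-reader; the only assumption about a page object is that it responds to `text`, as the pages in the script do.

```ruby
# Hypothetical helper (not part of pdf-reader): collects text from any
# objects that respond to #text, substituting '' for pages that raise.
# Returns the joined text and the 1-based numbers of the failed pages.
def pages_to_text(pages, separator: "\n")
  failed = []
  texts = pages.each_with_index.map do |page, index|
    begin
      page.text
    rescue StandardError
      failed << index + 1 # record the 1-based page number
      ''
    end
  end
  [texts.join(separator), failed]
end

# Stand-in page objects, used here instead of real PDF::Reader pages.
GoodPage = Struct.new(:text)

class BadPage
  def text
    raise 'unreadable page'
  end
end

text, failed = pages_to_text([GoodPage.new('one'), BadPage.new, GoodPage.new('three')])
puts text
puts "Failed pages: #{failed.inspect}"
```

Returning the list of failed page numbers alongside the text is one possible design choice: it lets the caller decide whether to print a warning, retry, or abort, rather than fixing that decision inside the loop.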