You can use the hd command to see the exact hex value of the offending byte.<br><br>To remove all non-printable characters from a file, you can do:<br><br><pre>tr -cd '\11\12\15\40-\176' < inputfile.txt > outputfile.txt<br>
</pre><br>This uses octal: \11 is tab, \12 is linefeed, and \15 is carriage return, while \40 through \176 are the printable ASCII characters (space through tilde). The -c flag complements that set and -d deletes, so every byte outside the set is removed.<br><br>You can adjust the above to keep more (or fewer) characters by looking up the values in the output of the 'ascii' command.<br>
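<br>As a rough sketch of the whole workflow, something like the following should work. The file names sample.txt and cleaned.txt are made up for illustration, the sample file is built with printf only so there is something to test on, and od stands in for hd because it is installed on more systems. I am also assuming the stray byte is the one vi shows as &lt;97&gt;, i.e. hex 97 (octal 227).<br>

```shell
# Hypothetical sample file: "foo", one stray byte 0x97 (octal 227), "bar", newline.
printf 'foo\227bar\n' > sample.txt

# Inspect the raw bytes; hd or hexdump -C would show the same data in hex.
od -c sample.txt

# Count the bytes that fall outside tab, LF, CR and printable ASCII.
# LC_ALL=C makes tr treat the input strictly as single bytes.
LC_ALL=C tr -d '\11\12\15\40-\176' < sample.txt | wc -c

# Keep only tab, LF, CR and printable ASCII; everything else is deleted.
LC_ALL=C tr -cd '\11\12\15\40-\176' < sample.txt > cleaned.txt

# Show the cleaned result.
cat cleaned.txt
```

Writing to a separate output file keeps the original intact, so the two can be compared before the cleaned copy is imported.<br>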
<br><div class="gmail_quote">On Thu, Dec 10, 2009 at 2:47 PM, Insurance Squared Inc. <span dir="ltr"><<a href="mailto:gcooke@insurancesquared.com">gcooke@insurancesquared.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I've got some 'text' files created by an OCR program. Some of the text files have the occasional weird character in them that is causing issues when I import them. How can I get rid of these characters from the command prompt?<br>
<br>
When I 'nano' one file, it shows a question mark with a white background. When I view the file with vi, not that I use vi :) , I see <97> where the character is - probably the decimal representation.<br>
<br>
I tried "perl - p -i -e 's/?//g' *" and "perl -p -i -e 's/\<97\>/g' *" as a search and replace but neither removed the character from the file. Grep doesn't find the characters either. <br>
g<br>
<br>
</blockquote></div><br><br clear="all"><br>-- <br>Khalid M. Baheyeldin<br><a href="http://2bits.com">2bits.com</a>, Inc.<br><a href="http://2bits.com">http://2bits.com</a><br>Drupal optimization, development, customization and consulting.<br>