For the Stack Overflow rules lawyers: I know this isn't a complete answer, but it's too long for a comment.

As a non-speaker of the language, it's rather difficult for me to identify differences here. There's quite a lot of text, and while I can see that the fonts are different, it's not clear to me that the individual glyphs are. Can you point to one specific glyph there that is incorrect after it is copied?

The font embedded in the file (Arial Unicode MS) has an attached ToUnicode CMap which looks correct to me; however, several of the single character codes map to multiple Unicode code points. E.g. character code 0x564 maps to the Unicode values 0x093e, 0x0901. So this is what the text in the PDF file looks like. I have no way to tell easily if this is correct.

I could laboriously decode the entire string, check to see what the Unicode code points are and then try to match those to the characters in the original file by placing them individually in a Word document, using Arial Unicode MS. But it looks to me like an awful lot of the characters are correct, and I don't want to waste a lot of time doing that.
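To make the one-to-many mapping mentioned above concrete (character code 0x564 to 0x093e, 0x0901), here is a minimal Python sketch of how a single ToUnicode `bfchar` entry expands into several code points. The CMap fragment is a hypothetical reconstruction; only the 0x564 mapping comes from the file discussed here, and the names `SAMPLE_CMAP`, `parse_bfchar` and `describe` are made up for illustration. It assumes the destination string is UTF-16BE hex, as ToUnicode CMaps use, and it ignores `bfrange` sections entirely.

```python
import re
import unicodedata

# Hypothetical bfchar fragment. Only the 0x564 entry is taken from the
# discussion above; the real CMap in the file contains many more entries.
SAMPLE_CMAP = """\
1 beginbfchar
<0564> <093E0901>
endbfchar
"""

# Matches one "<source code> <destination string>" pair of hex strings.
BFCHAR_ENTRY = re.compile(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>")


def parse_bfchar(cmap_text):
    """Return {character code: Unicode string} for simple bfchar entries."""
    mapping = {}
    for src, dst in BFCHAR_ENTRY.findall(cmap_text):
        # The destination is a UTF-16BE string written as hex digits, so a
        # single character code can expand to more than one code point.
        mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


def describe(text):
    """List each code point in text together with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


mapping = parse_bfchar(SAMPLE_CMAP)
for line in describe(mapping[0x564]):
    print(line)
```

Printing the Unicode names this way is a cheaper check than pasting characters into a Word document one at a time, although it still takes someone who reads the script to say whether the resulting sequence is the right one for that glyph.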