I will be using the terms “closed captions” and “subtitles” interchangeably in this post because it isn’t always possible to know whether the binary image based SUP file you have contains closed captions, which include both descriptive text and dialog, or subtitles, which contain only dialog. I’ve been watching a few TV shows on Hulu and believe that at least Cloak & Dagger and Stitchers had their subtitles ripped from an m2ts file, likely from Blu-ray, using either HdBr Stream Extractor v9 or MeGUI, and then converted using Subtitle Edit.

More often than not I will see a sentence in italics that has two or more words touching each other near the middle of the sentence, because the distance in pixels between italic letters is much smaller than between normal letters. Why it doesn’t happen as often across the entire sentence is beyond me at this time. Then again, I have a massive replace list.

A space width of ten or eleven pixels works well for most Blu-ray content. Subtitle Edit likes DVD subtitles to be around 6-8 pixels apart because the letters are lower resolution. You can adjust Subtitle Edit to look for letters closer together or further apart based on the number of pixels you tell it are in a space, but this is a global setting for each input file and cannot be adjusted specifically for italics, because everything is in an image based format, specifically a SUP file. For example, if you set it to look for letters/blocks closer together, you will likely end up with a lot of individual characters instead of words. If you set it to allow letters/blocks further apart, you will merge a lot of words together. I t c a n a l s o b e a b a d t h in g.

1) Tesseract. This does a decent job, but I no longer use it as it has problems with some fonts and italics.

2) Binary Image Compare. Blu-ray MPEG-TS and DVD MPEG-PS containers use images for playback of video closed captioning. This is what I use and what I think Hulu also uses. It is also the recommended option for Subtitle Edit.

Binary Image Compare has to be trained to look at the many different fonts that you can come across, right down to the letter, number, punctuation, and symbol level. The process of teaching it what each letter looks like, typically multiple times for the exact same letter early on, is onerous and will crush your soul. When it comes across a “block” of information it asks you what it is and whether it is italic or not. You can expand the block to fit quotes and the like, but you cannot shrink it from what it originally detects. Sometimes it will detect “rt” as a single block, so you have to add “rt” as a letter. This is most common with italics, but depending on the font it can also affect normal letters and numbers.

When Subtitle Edit comes across a word it doesn’t know, it will ask you to do one of a few things.

A) Add to names/noise list (case sensitive). This is for things like Hogwarts or WebRTC.

This is for adding words that are case insensitive that are not in its default dictionary.

A lower case L looks the same as an upper case “I” in most sans-serif fonts. You will need to use this to fix things like. This can also fix words that are too close together.

Best practice is to have Subtitle Edit just rack up the words it doesn’t know so you can bang the majority of the duplicates out after the first full run. Set the error% value to 1.0 percent for higher accuracy. Run it again to catch some more, and then keep running it until there is nothing left to fix. Don’t be surprised if a few more words pop up on the second or third pass.

I’ve added a lot of characters and words to the database from multiple TV series and movies, as some of them use unique fonts and contain both unique words and names. I use the website frequently because what Subtitle Edit provides in its interface is very limited.

In some cases Subtitle Edit will fail to show a letter or will detect it incorrectly. If you see this, you can simply click on the line of text that has the problem in the main window, navigate to the specific character, and update it accordingly. I do not recommend updating an I to be an l or an l to be an I. If you do that you will be playing whack-a-mole forever. Set it, forget it, and add it to the “Add pair to OCR replace list” on the fly as it fails onward.

Do not be surprised if your subtitles are not properly aligned. I currently use either Easy Subtitles Synchronizer or Subtitle Edit to fix this problem, depending upon my mood.
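To make the pixel spacing trade-off concrete, here is a minimal Python sketch of how a single global space width splits recognized blocks into words. It is only an illustration of the idea, not Subtitle Edit's actual code: the `Block` tuple, the `join_blocks` function, and the coordinates are all made up for this example.

```python
from typing import List, Tuple

# Hypothetical representation of one recognized block: (left_x, right_x, text).
# This is an assumption for illustration, not Subtitle Edit's internal format.
Block = Tuple[int, int, str]


def join_blocks(blocks: List[Block], space_px: int) -> str:
    """Join blocks left to right, inserting a space wherever the gap
    between two neighbouring blocks is at least space_px pixels."""
    out = []
    prev_right = None
    for left, right, text in sorted(blocks):
        if prev_right is not None and left - prev_right >= space_px:
            out.append(" ")
        out.append(text)
        prev_right = right
    return "".join(out)


# Made-up coordinates: letters inside a word sit 3 px apart, words 9 px apart.
line = [(0, 10, "I"), (13, 23, "t"), (32, 42, "i"), (45, 55, "s")]

print(join_blocks(line, space_px=8))   # threshold between the two gaps
print(join_blocks(line, space_px=2))   # too small: every letter stands alone
print(join_blocks(line, space_px=12))  # too large: the words merge
```

Because the threshold applies to the whole file, a value that suits the normal lines can still be too large for the tighter italic lines, which is the merging problem described in this post.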
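The reason for adding whole-word pairs to a replace list, instead of swapping every I for an l, can be sketched in a few lines of Python. This is a rough illustration under my own assumptions: the `apply_replace_list` function and the sample pairs are hypothetical, and a plain dictionary stands in for whatever format Subtitle Edit stores its list in.

```python
import re
from typing import Dict


def apply_replace_list(text: str, pairs: Dict[str, str]) -> str:
    """Replace only whole words that appear in pairs; leave everything else alone."""
    if not pairs:
        return text
    # \b anchors keep a pair from firing inside a longer word.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in pairs) + r")\b"
    )
    return pattern.sub(lambda m: pairs[m.group(1)], text)


# Hypothetical misreads where l and I were confused by the OCR pass.
pairs = {"lt": "It", "Iike": "like", "heIp": "help"}

print(apply_replace_list("lt looks Iike I need heIp, Ilsa.", pairs))
```

Each pair fixes one known misread everywhere it occurs, while words that are already correct, such as “I” and “Ilsa” in the sample line, are left alone. A blanket I-to-l swap would break them, which is the whack-a-mole problem.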