Say you are scraping a page or parsing an RSS feed, and a field you are interested in, say the title, has some HTML elements in it (normally an image), but you want to strip those out and display just the text to make it more readable. In such cases the snippet below comes to the rescue: it strips out any HTML tags and replaces each one with a space. It uses NSScanner to scan to the beginning of a tag (<) and then on to the end of that tag (>), and replaces the whole tag, from < through >, with a space. The text between tags is left untouched.
<!--break-->
Code Snippet
The following code snippet shows the main method.
    + (NSString *)flattenHtml:(NSString *)html {
        NSScanner *theScanner = [NSScanner scannerWithString:html];
        NSString *text = nil;
        while (![theScanner isAtEnd]) {
            // find start of tag
            [theScanner scanUpToString:@"<" intoString:NULL];
            // find end of tag; stop when nothing more could be scanned,
            // so a stale or nil value of text is never reused
            text = nil;
            if (![theScanner scanUpToString:@">" intoString:&text]) {
                break;
            }
            // replace the found tag with a space
            // (you can filter multi-spaces out later if you wish)
            html = [html stringByReplacingOccurrencesOfString:
                             [NSString stringWithFormat:@"%@>", text]
                                                   withString:@" "];
        }
        return html;
    }
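
The comment in the snippet mentions filtering out multi-spaces later. One way to do that is sketched below. CollapseWhitespace is a hypothetical helper name, not part of the original snippet, and SomeClass in the usage comment stands in for whichever class you put flattenHtml: in. It splits the string on whitespace, drops the empty pieces that consecutive spaces leave behind, and joins the rest with single spaces, which also trims both ends.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption, not from the original snippet):
// collapses every run of whitespace into one space and trims both ends.
static NSString *CollapseWhitespace(NSString *input) {
    NSArray<NSString *> *parts =
        [input componentsSeparatedByCharactersInSet:
                   [NSCharacterSet whitespaceAndNewlineCharacterSet]];

    // Adjacent separators produce empty strings; keep only real words.
    NSMutableArray<NSString *> *words = [NSMutableArray array];
    for (NSString *part in parts) {
        if (part.length > 0) {
            [words addObject:part];
        }
    }
    return [words componentsJoinedByString:@" "];
}

// Possible usage, assuming flattenHtml: lives on a class named SomeClass:
//   NSString *title = CollapseWhitespace([SomeClass flattenHtml:rawTitle]);
```

Because each tag becomes a space, a title such as "Breaking <b>news</b>" comes back from flattenHtml: with doubled and trailing spaces; passing the result through a helper like this leaves single spaces between words.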