muckrights-sans-merde

 bonum fabula frat

### cleaning-an-html-mess-0-2

previous version:

=> cleaning-an-html-mess-in-30-minutes.html cleaning-an-html-mess-in-30-minutes

*originally posted:* jun 2021

the original code took 30 minutes, this update took 22.

```
# cancel consecutive repeated text 0.2
# jun 2021
# license: creative commons cc0 1.0 (public domain)
# http://creativecommons.org/publicdomain/zero/1.0/

# "compile" program from code editor:
# fig50 ccrt.fig | tail

#date
#Thu Jun 3 06:30:06 UTC 2021 <- started coding

#logic:

#1. parsing on when it hits: ">"
#2. parsing off when it hits: "</td"

# date
# Thu Jun 3 17:51:30 UTC 2021 0.2 started
# - len / minus 4 / left / rtrim
# - if leading "[", locate "]", len / minus location / right / ltrim
# date
# Thu Jun 3 18:13:52 UTC 2021 0.2 completed

#3. compare to buffer
#4. if non-match, print
#5. copy to buffer

#go!

p arrstdin
prevbuf ""
buf ""
fb 0
forin lin p
    forin each lin
        fb 1
        now buf plus each swap now buf
        # parsing on when it hits: ">"
        ifequal each ">"
            buf "" # reset
            next
        # parsing off when it hits: "</td"
        bufright4 buf right 4 lcase
        ifequal bufright4 "</td"
            # len / minus 4 / left / rtrim
            buflen buf len minus 4
            now buf left buflen rtrim swap now buf
            # if leading "[", locate "]", len / minus location / right / ltrim
            bufleft buf ltrim left 1
            ifequal bufleft "["
                bufloc instr buf "]"
                iftrue bufloc
                    buflen buf len minus bufloc
                    now buf right buflen ltrim swap now buf
                    next
                next
            # compare to buffer
            ifequal buf prevbuf
                fb 0
            else
                # if non-match, print
                fb 1
                next
            # copy to buffer
            prevbuf buf
            break
            next
        next
    buf ""
    iftrue fb
        now lin print
        next
    next

# date
# Thu Jun 3 06:59:43 UTC 2021

# 30 minutes including debugging
# (probably 20-25 without the comments)
```

i noticed that at least one person wasn't having their duplicates filtered, because there was one trailing space after half their posts, before the closing table data tag. the first addition to the logic (the rtrim on the buffer) should fix that. it wasn't necessary to "ignore" (filter) the tr-bridge user.
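
for anyone who doesn't read fig, here is a rough python sketch of the same five steps from the comments above. it is not output from the fig compiler and not the code that was actually run. the names `cell_text` and `filter_repeats` are made up for this example. like the fig version, it only looks at the first "</td" on each line.

```python
import re

# hypothetical helper names; this is a sketch of the logic, not the original program

def cell_text(line):
    """Return the cleaned text just before the first "</td" on the line, or None."""
    m = re.search(r"</td", line, re.IGNORECASE)
    if m is None:
        return None                      # no table cell on this line
    end = m.start()
    start = line.rfind(">", 0, end) + 1  # text starts after the last ">" (0 if none)
    text = line[start:end].rstrip()      # 0.2 addition: drop trailing spaces
    if text.lstrip().startswith("["):    # 0.2 addition: drop a leading [...] marker
        close = text.find("]")
        if close != -1:
            text = text[close + 1:].lstrip()
    return text

def filter_repeats(lines):
    """Keep lines whose cell text differs from the previous cell text seen."""
    prev = ""
    kept = []
    for line in lines:
        text = cell_text(line)
        if text is not None:
            repeated = (text == prev)
            prev = text                  # copy to buffer either way
            if repeated:
                continue                 # match: skip the line
        kept.append(line)                # the whole unedited line is kept
    return kept
```

to use it as a filter, something like `sys.stdout.writelines(filter_repeats(sys.stdin))` (after `import sys`) should do.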

checking for a leading "[" and removing [*] from the buffer lets the dupe filter catch those lines too, but the [*] is preserved in the output because the program prints the whole unedited line, not the buffer itself.

=> https://muckrights-sans-merde.neocities.org