All source code listed below is under the MIT license unless a LICENSE file states otherwise.
Isspam
A very fast evaluator that summarizes specific details of text files, such as word, sentence, number and forbidden-word counts.
This repository contains two versions of the same algorithm.
Versions:
- Rust (risspam) written by 12bitfloat.
- C (isspam) written by retoor.
Building
make build
Build isspam with memory check (requires valgrind to be installed):
make valgrind
Running
Using files as parameter
./(r)isspam ./spam/*.txt
./(r)isspam ./not_spam/*.txt
Using stdin
Useful for automation. Works only with the C version (isspam).
cat ./spam/example_spam1.txt | ./isspam
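As an illustration of the stdin mode, here is a minimal sketch of how a C tool could read an entire piped stream into memory before analyzing it. The function name `read_all` and the growth strategy are assumptions made for this example; they are not taken from the isspam source.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from isspam): read a whole stream into a
 * NUL-terminated heap buffer. The caller must free() the result.
 * Returns NULL if memory runs out. */
char *read_all(FILE *stream, size_t *out_len) {
    size_t cap = 4096;
    size_t len = 0;
    char *buf = malloc(cap);
    if (!buf) return NULL;

    size_t n;
    /* Always leave one byte free for the terminating NUL. */
    while ((n = fread(buf + len, 1, cap - len - 1, stream)) > 0) {
        len += n;
        if (cap - len <= 1) {
            /* Buffer is full: double its capacity. */
            char *grown = realloc(buf, cap * 2);
            if (!grown) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
    }
    buf[len] = '\0';
    if (out_len) *out_len = len;
    return buf;
}
```

Calling `read_all(stdin, &len)` when no file arguments are given would cover the piped case shown above, while `read_all(fopen(path, "rb"), &len)` style calls could cover file parameters.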
Example output
Example output produced by isspam.
File: ./spam/example_spam3.txt
Capitalized words: 39
Sentences: 20
Words: 420
Numbers: 1
Forbidden words: 15
<0:recovery>
<1:techie>
<2:https>
<3:digital>
<4:hack>
<5://>
<6:com>
<7:@>
<8:crypto>
<9:bitcoin>
<10:whatsapp>
<11:cryptocurrency>
<12:stolen>
<13:contact>
<14:understanding>
Word count per sentence: 21
Memory usage: 1 MB, 6.460 (re)allocated, 4.222 unqiue free'd, 0 in use.
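To make the counters above more concrete, here is a rough single-pass sketch of how such statistics could be gathered in C. It is not the isspam implementation: the type `text_stats`, the functions `analyze_text` and `words_per_sentence`, and all tokenization rules (whitespace-separated tokens, a leading digit marks a number, every `.`, `!` or `?` ends a sentence) are assumptions chosen for illustration, and forbidden-word matching is left out.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type; fields mirror labels in the example output. */
typedef struct {
    size_t words;
    size_t capitalized_words;
    size_t sentences;
    size_t numbers;
} text_stats;

/* Single pass over a NUL-terminated buffer.
 * Assumed rules (not taken from isspam):
 *  - a token is a run of non-whitespace characters;
 *  - a token starting with a digit counts as a number;
 *  - any other token counts as a word;
 *  - a word starting with an uppercase letter is capitalized;
 *  - every '.', '!' or '?' ends a sentence. */
text_stats analyze_text(const char *text) {
    text_stats stats = {0, 0, 0, 0};
    int in_token = 0;

    for (const unsigned char *p = (const unsigned char *)text; *p; p++) {
        if (isspace(*p)) {
            in_token = 0;
            continue;
        }
        if (!in_token) {
            /* First character of a new token decides its kind. */
            in_token = 1;
            if (isdigit(*p)) {
                stats.numbers++;
            } else {
                stats.words++;
                if (isupper(*p)) stats.capitalized_words++;
            }
        }
        if (*p == '.' || *p == '!' || *p == '?') stats.sentences++;
    }
    return stats;
}

/* Integer average; returns 0 when there are no sentences. */
size_t words_per_sentence(const text_stats *stats) {
    return stats->sentences ? stats->words / stats->sentences : 0;
}
```

With the figures from the example output, 420 words over 20 sentences gives 420 / 20 = 21, which matches the reported word count per sentence. Whether isspam rounds or truncates in other cases is not stated here.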
Valgrind status
Valgrind output for the isspam version.
The Rust variant thinks it's too cool for memory checks afterwards.
Date: 2024-11-30
==58062==
==58062== HEAP SUMMARY:
==58062== in use at exit: 0 bytes in 0 blocks
==58062== total heap usage: 6,490 allocs, 6,490 frees, 2,343,156 bytes allocated
==58062==
==58062== All heap blocks were freed -- no leaks are possible
==58062==
==58062== For lists of detected and suppressed errors, rerun with: -s
==58062== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)