Text Processing

Efficiently process and transform text using Claude Code.

Basic Text Operations

bash

# Start Claude Code
claude

# Text format conversion
You: Convert README.txt to Markdown format

Claude: Let me read the file and convert the format...

Claude wants to:
  • Read file: README.txt
Allow? [y/n/always/never]

You: y

Claude: Content read, now converting to Markdown...

Claude wants to:
  • Write file: README.md
Allow? [y/n/always/never]

You: y

Claude: Conversion complete!

Created README.md, containing:
- Formatted headings
- List items
- Code blocks
- Links

Original file README.txt remains unchanged.

Batch Text Replacement

bash

You: Replace "Github" with "GitHub" in all Markdown files

Claude: Let me search for files that need replacement...

Claude wants to:
  • Run command: grep -r "Github" docs/ --include="*.md" -l
Allow? [y/n/always/never]

You: y

Claude: Found 15 files containing "Github":

- docs/guide/intro.md
- docs/api/reference.md
- docs/tutorial/setup.md
... (15 files total)

Claude wants to:
  • Run command: find docs -name "*.md" -exec sed -i '' 's/Github/GitHub/g' {} +
Allow? [y/n/always/never]

You: y

Claude: Replacement complete!

Replaced "Github" with "GitHub" in 15 files.
47 occurrences replaced in total.
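
Note that `sed -i ''` in the transcript above is the BSD/macOS spelling; GNU sed takes `-i` with no separate argument. Where a preview or a portable version is wanted, the same replacement can be sketched in Python. This is only a sketch under stated assumptions, not what Claude Code runs: the names `replace_in_text` and `replace_in_tree` are hypothetical, files are assumed to be UTF-8, and unlike the `sed` command it matches whole words only.

```python
import re
from pathlib import Path


def replace_in_text(text, old, new):
    """Replace whole-word occurrences of `old`; return (new_text, count)."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    return pattern.subn(new, text)


def replace_in_tree(root, old, new, dry_run=True):
    """Apply the replacement to every *.md file under `root`.

    With dry_run=True nothing is written; the per-file counts are
    returned so the change can be reviewed first.
    """
    changed = {}
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8")  # assumption: UTF-8 files
        updated, count = replace_in_text(text, old, new)
        if count:
            changed[str(path)] = count
            if not dry_run:
                path.write_text(updated, encoding="utf-8")
    return changed
```

Calling `replace_in_tree("docs", "Github", "GitHub")` first and rerunning with `dry_run=False` once the counts look right follows the "back up before batch replacements" advice later on this page.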

Text Extraction

bash

You: Extract all error messages from logs/app.log

Claude: Let me extract the error logs...

Claude wants to:
  • Run command: grep "ERROR" logs/app.log
Allow? [y/n/always/never]

You: y

Claude: Found 23 error log entries:

[2026-04-16 10:23:45] ERROR: Database connection failed
[2026-04-16 10:24:12] ERROR: Invalid user token
[2026-04-16 11:15:33] ERROR: Payment processing timeout
[2026-04-16 12:45:21] ERROR: File not found: /data/users.csv
... (23 entries total)

Error type breakdown:
- Database errors: 8
- Authentication errors: 6
- Payment errors: 5
- File errors: 4

Would you like me to generate a detailed report?
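
The breakdown by error type in the transcript goes beyond what `grep` returns. One way such a summary could be produced is sketched below. The line format is taken from the sample entries above; the keyword-to-category mapping in `CATEGORIES` is a hypothetical illustration and would need adapting to real log messages.

```python
import re
from collections import Counter

# Matches lines such as: [2026-04-16 10:23:45] ERROR: Database connection failed
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] ERROR: (?P<msg>.*)$")

# Hypothetical keyword-to-category mapping, for illustration only.
CATEGORIES = {
    "Database": ("database",),
    "Authentication": ("token", "auth"),
    "Payment": ("payment",),
    "File": ("file",),
}


def categorize(message):
    """Return the first category whose keyword appears in the message."""
    lowered = message.lower()
    for category, keywords in CATEGORIES.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "Other"


def summarize_errors(lines):
    """Count ERROR lines per category; non-ERROR lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[categorize(match.group("msg"))] += 1
    return counts
```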

Text Formatting

bash

You: Format data.json to make it more readable

Claude: Let me format the JSON file...

Claude wants to:
  • Read file: data.json
Allow? [y/n/always/never]

You: y

Claude: The current file is minified single-line JSON. I'll format it...

Claude wants to:
  • Run command: jq '.' data.json > data_formatted.json
Allow? [y/n/always/never]

You: y

Claude: Formatting complete!

Original file: data.json (1 line, 15.2 KB)
New file: data_formatted.json (234 lines, 16.8 KB)

New file contains:
- Proper indentation
- Line breaks
- Better readability

Replace the original file?

You: y

Claude wants to:
  • Run command: mv data_formatted.json data.json
Allow? [y/n/always/never]

You: y

Claude: Original file replaced. data.json is now well-formatted.
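
If `jq` is not installed, the same format-then-replace flow can be sketched with Python's standard `json` module. The function names here are hypothetical. Parsing the file first means invalid JSON raises an error before anything is overwritten; note that re-serializing may change how some numbers are written.

```python
import json
import os
from pathlib import Path


def pretty_json(raw, indent=2):
    """Parse a JSON string and return it indented, ending with a newline."""
    data = json.loads(raw)  # raises json.JSONDecodeError on invalid input
    return json.dumps(data, indent=indent, ensure_ascii=False) + "\n"


def format_json_file(path, indent=2):
    """Rewrite `path` in place, going through a temporary file."""
    target = Path(path)
    formatted = pretty_json(target.read_text(encoding="utf-8"), indent)
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(formatted, encoding="utf-8")
    os.replace(tmp, target)  # swap in the formatted file only after a successful write
```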

Text Merging

bash

You: Merge all log files in the logs directory

Claude: Let me merge the log files...

Claude wants to:
  • Run command: find logs -name "*.log" -type f
Allow? [y/n/always/never]

You: y

Claude: Found 12 log files:

- logs/app.log (2.3 MB)
- logs/error.log (456 KB)
- logs/access.log (5.6 MB)
- logs/debug.log (1.2 MB)
... (12 files total)

Total size: 15.8 MB

Claude wants to:
  • Run command: cat logs/*.log > logs/combined.log
Allow? [y/n/always/never]

You: y

Claude: Merge complete!

Output file: logs/combined.log
Size: 15.8 MB
Contains: content from all 12 log files

Logs merged in filename order.
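
One caveat with `cat logs/*.log > logs/combined.log`: on a second run the glob also matches the existing `combined.log`. The sketch below shows one way to avoid that by excluding the output file from the inputs. The function name `merge_logs` is hypothetical, and like the glob in the transcript it only looks at the top level of the directory.

```python
import tempfile
from pathlib import Path


def merge_logs(log_dir, output_name="combined.log"):
    """Concatenate *.log files in `log_dir` in filename order.

    Returns the names of the files that were merged.
    """
    directory = Path(log_dir)
    # Collect the sources first and skip the output file, so a rerun
    # does not feed the previous combined.log back into itself.
    sources = sorted(p for p in directory.glob("*.log") if p.name != output_name)
    with (directory / output_name).open("wb") as out:
        for src in sources:
            data = src.read_bytes()
            out.write(data)
            # Keep entries from different files on separate lines.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return [p.name for p in sources]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "app.log").write_text("app entry\n")
        Path(tmp, "error.log").write_text("error entry")
        print(merge_logs(tmp))
        print(Path(tmp, "combined.log").read_text())
```

Merging in filename order does not interleave entries by timestamp; sorting the combined lines would be a separate step.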

Text Splitting

bash

You: Split large_file.txt into smaller files of 1000 lines each

Claude: I'll split the file...

Claude wants to:
  • Run command: wc -l large_file.txt
Allow? [y/n/always/never]

You: y

Claude: File contains 5,678 lines.

Will split into 6 files (up to 1,000 lines each; the last file holds the remaining 678 lines).

Claude wants to:
  • Run command: split -l 1000 large_file.txt large_file_part_
Allow? [y/n/always/never]

You: y

Claude: Split complete!

Created files:
- large_file_part_aa (1000 lines)
- large_file_part_ab (1000 lines)
- large_file_part_ac (1000 lines)
- large_file_part_ad (1000 lines)
- large_file_part_ae (1000 lines)
- large_file_part_af (678 lines)

6 files created in total.
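
The chunking logic behind `split -l` can be sketched in a few lines of Python. The names `chunk_lines` and `split_file` are hypothetical, and the output files here get numeric suffixes (`_part_000`, `_part_001`, ...) rather than the `aa`, `ab` suffixes that `split` produces.

```python
from itertools import islice
from pathlib import Path


def chunk_lines(lines, size):
    """Yield lists of at most `size` items from an iterable of lines."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def split_file(path, size=1000):
    """Write numbered part files next to `path`; return their names."""
    source = Path(path)
    written = []
    with source.open("r", encoding="utf-8") as handle:
        for index, chunk in enumerate(chunk_lines(handle, size)):
            part = source.with_name(f"{source.stem}_part_{index:03d}{source.suffix}")
            part.write_text("".join(chunk), encoding="utf-8")
            written.append(part.name)
    return written
```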

Text Search and Statistics

bash

You: Count lines of code by programming language in the codebase

Claude: Let me count the lines of code...

Claude wants to:
  • Run command: find src -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" \) | xargs wc -l
Allow? [y/n/always/never]

You: y

Claude: Code statistics:

TypeScript (.ts, .tsx):
- Files: 145
- Total lines: 23,456
- Average per file: 162 lines

JavaScript (.js, .jsx):
- Files: 34
- Total lines: 4,567
- Average per file: 134 lines

Python (.py):
- Files: 12
- Total lines: 2,345
- Average per file: 195 lines

Go (.go):
- Files: 8
- Total lines: 1,234
- Average per file: 154 lines

Totals:
- Files: 199
- Total lines: 31,602
- Codebase size: Medium

Primary language: TypeScript (74.2%)
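
Piping `find` into `xargs wc -l` gives per-file counts, but filenames containing spaces break it and long file lists can produce several separate `total` lines. A per-language tally like the one above could be sketched as follows. The extension-to-language table mirrors the extensions in the command; the function name `count_lines` is hypothetical.

```python
import tempfile
from collections import defaultdict
from pathlib import Path

# Extensions taken from the find command above.
LANGUAGES = {
    ".ts": "TypeScript", ".tsx": "TypeScript",
    ".js": "JavaScript", ".jsx": "JavaScript",
    ".py": "Python",
    ".go": "Go",
}


def count_lines(root):
    """Return {language: {"files": n, "lines": m}} for files under `root`."""
    stats = defaultdict(lambda: {"files": 0, "lines": 0})
    for path in Path(root).rglob("*"):
        language = LANGUAGES.get(path.suffix)
        if language is None or not path.is_file():
            continue
        with path.open("rb") as handle:
            # Counts a final line without a newline too, unlike `wc -l`.
            lines = sum(1 for _ in handle)
        stats[language]["files"] += 1
        stats[language]["lines"] += lines
    return dict(stats)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "app.ts").write_text("const a = 1;\nconst b = 2;\n")
        Path(tmp, "tool.py").write_text("print('hi')\n")
        for language, totals in count_lines(tmp).items():
            print(language, totals)
```

These are raw line counts, including blank lines and comments.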

Text Encoding Conversion

bash

You: Convert all files from GBK encoding to UTF-8

Claude: Let me detect and convert the encoding...

Claude wants to:
  • Run command: find . -name "*.txt" -type f
Allow? [y/n/always/never]

You: y

Claude: Found 23 text files.

Detecting encoding...

Claude wants to:
  • Run command: find . -name "*.txt" -type f -exec file --mime-encoding {} +
Allow? [y/n/always/never]

You: y

Claude: Encoding detection results:

GBK encoding: 8 files
UTF-8 encoding: 15 files

Files to convert:
- data/users_cn.txt
- docs/readme_cn.txt
- logs/app_cn.txt
... (8 files total)

Claude wants to:
  • Run command: iconv -f GBK -t UTF-8 data/users_cn.txt > data/users_cn_utf8.txt
  • Run command: mv data/users_cn_utf8.txt data/users_cn.txt
  [... repeated for other files]
Allow? [y/n/always/never]

You: y

Claude: Encoding conversion complete!

Converted 8 files from GBK to UTF-8.
All text files are now UTF-8 encoded.
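
Encoding detection by `file` is a guess and may not name GBK explicitly, so it helps to verify by decoding. The sketch below assumes, as in the transcript, that any file that is not valid UTF-8 is GBK; the names `to_utf8` and `convert_file` are hypothetical. Files that already decode as UTF-8 (including plain ASCII) are left untouched.

```python
from pathlib import Path


def to_utf8(raw, source_encoding="gbk"):
    """Return (utf8_bytes, changed) for the given raw bytes.

    Assumption: input that is not valid UTF-8 is in `source_encoding`.
    """
    try:
        raw.decode("utf-8")
        return raw, False
    except UnicodeDecodeError:
        # Raises UnicodeDecodeError again if the bytes are not valid
        # in the assumed source encoding either.
        return raw.decode(source_encoding).encode("utf-8"), True


def convert_file(path, source_encoding="gbk"):
    """Convert one file in place; return True if it was rewritten."""
    target = Path(path)
    converted, changed = to_utf8(target.read_bytes(), source_encoding)
    if changed:
        target.write_bytes(converted)
    return changed
```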

Regular Expression Processing

bash

You: Extract all email addresses from the text

Claude: Let me extract emails using regex...

Claude wants to:
  • Run command: grep -oE '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' data/contacts.txt
Allow? [y/n/always/never]

You: y

Claude: Extracted 45 email addresses:

john.doe@example.com
jane.smith@company.org
admin@website.net
support@service.io
... (45 total)

Claude wants to:
  • Write file: data/emails.txt
Allow? [y/n/always/never]

You: y

Claude: Email list saved to data/emails.txt

Statistics:
- Total: 45
- Unique emails: 42
- Duplicate emails: 3
- Domain distribution:
  - example.com: 12
  - company.org: 8
  - website.net: 6
  - Other: 19
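
The statistics above (total, unique, per-domain counts) can be derived from the raw matches. The sketch below reuses the regular expression from the `grep` command; the function name `extract_emails` is hypothetical, and treating addresses as case-insensitive when deduplicating is an assumption.

```python
import re
from collections import Counter

# Same pattern as the grep -oE command above.
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def extract_emails(text):
    """Return (all_matches, unique_lowercased, domain_counts)."""
    found = EMAIL_RE.findall(text)
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    unique = list(dict.fromkeys(address.lower() for address in found))
    # Domains are counted over all matches, duplicates included.
    domains = Counter(address.lower().split("@", 1)[1] for address in found)
    return found, unique, domains
```

This pattern is a pragmatic approximation: it accepts some strings that are not valid addresses and rejects some unusual valid ones.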

Text Deduplication

bash

You: Remove duplicate lines from list.txt

Claude: Let me remove the duplicates...

Claude wants to:
  • Read file: list.txt
Allow? [y/n/always/never]

You: y

Claude: File analysis:

- Total lines: 1,234
- Unique lines: 892
- Duplicate lines: 342

Claude wants to:
  • Run command: sort -u list.txt > list_unique.txt
Allow? [y/n/always/never]

You: y

Claude: Deduplication complete!

Original file: list.txt (1,234 lines)
New file: list_unique.txt (892 lines)
Removed: 342 duplicate lines

Replace the original file?

You: y

Claude wants to:
  • Run command: mv list_unique.txt list.txt
Allow? [y/n/always/never]

You: y

Claude: Original file replaced. list.txt now contains only unique lines.
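
Deduplicating through `sort` also reorders the file. When the original line order matters, duplicates can be dropped while keeping the first occurrence of each line, as in this minimal sketch (the function name `dedupe_lines` is hypothetical):

```python
def dedupe_lines(lines):
    """Yield each distinct line once, in first-seen order."""
    seen = set()
    for line in lines:
        # Compare without the newline so a final line that lacks one
        # still counts as a duplicate of an earlier identical line.
        key = line.rstrip("\n")
        if key not in seen:
            seen.add(key)
            yield line
```

A usage example under the same assumptions: `open("list_unique.txt", "w").writelines(dedupe_lines(open("list.txt")))`.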

CSV Processing

bash

You: Extract the email column from users.csv and save as a new file

Claude: Let me extract the email column...

Claude wants to:
  • Read file: users.csv
Allow? [y/n/always/never]

You: y

Claude: CSV file structure:

Columns: id, name, email, age, city
Rows: 1,234

Claude wants to:
  • Run command: cut -d',' -f3 users.csv > emails.txt
Allow? [y/n/always/never]

You: y

Claude: Extraction complete!

Output file: emails.txt
Contains: 1,234 lines (including the header row)

Remove the header?

You: y

Claude wants to:
  • Run command: tail -n +2 emails.txt > emails_no_header.txt && mv emails_no_header.txt emails.txt
Allow? [y/n/always/never]

You: y

Claude: Header removed. emails.txt now contains only email addresses.
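
`cut -d','` splits on every comma, so it returns the wrong field when a value is quoted and contains a comma (for example a name written as `"Doe, John"`). A sketch using Python's standard `csv` module, which handles quoting and selects the column by header name, might look like this; the function name `extract_column` is hypothetical.

```python
import csv
import io


def extract_column(csv_text, column):
    """Return the values of `column` (header row excluded)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if column not in (reader.fieldnames or []):
        raise KeyError(f"no column named {column!r}")
    return [row[column] for row in reader]
```

Because the column is looked up by name, the header is skipped automatically and no separate `tail -n +2` step is needed.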

Markdown Processing

bash

You: Extract headings from all Markdown files and generate a table of contents

Claude: Let me extract the Markdown headings...

Claude wants to:
  • Run command: grep -rh --include="*.md" "^#" docs/
Allow? [y/n/always/never]

You: y

Claude: Extracted 156 headings.

Generating table of contents structure...

Claude wants to:
  • Write file: docs/TABLE_OF_CONTENTS.md
Allow? [y/n/always/never]

You: y

Claude: Table of contents generated: docs/TABLE_OF_CONTENTS.md

Content preview:

```markdown
# Table of Contents

## Getting Started
- [Installation Guide](guide/installation.md)
- [Basic Usage](guide/basics.md)
- [Configuration](guide/configuration.md)

## API Reference
- [Authentication](api/auth.md)
- [User Management](api/users.md)
- [Order Processing](api/orders.md)

## Tutorials
- [First Application](tutorial/first-app.md)
- [Advanced Features](tutorial/advanced.md)

... (156 entries total)
```

Table of contents organized by folder and heading hierarchy.
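
A plain `grep "^#"` also matches comment lines inside fenced code blocks, such as `# Start Claude Code` in a shell example. The sketch below skips fenced blocks while collecting headings. It is a simplified illustration: the function name `extract_headings` is hypothetical, fence tracking just toggles on any line starting with three backticks or tildes, and closing `#` sequences and setext (underlined) headings are not handled.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
FENCE_RE = re.compile(r"^\s{0,3}(```|~~~)")


def extract_headings(markdown):
    """Return (level, title) pairs for ATX headings outside code fences."""
    headings = []
    in_fence = False
    for line in markdown.splitlines():
        if FENCE_RE.match(line):
            in_fence = not in_fence  # simplified: any fence line toggles
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
    return headings
```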


Log Analysis

bash
You: Analyze access.log, show the top 10 IPs by visit count

Claude: Let me analyze the access log...

Claude wants to:
  • Run command: awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10
Allow? [y/n/always/never]

You: y

Claude: Top 10 IPs by visit count:

1. 192.168.1.100 - 5,234 visits
2. 10.0.0.50 - 3,456 visits
3. 172.16.0.25 - 2,890 visits
4. 192.168.1.101 - 2,345 visits
5. 10.0.0.51 - 1,987 visits
6. 172.16.0.26 - 1,654 visits
7. 192.168.1.102 - 1,432 visits
8. 10.0.0.52 - 1,234 visits
9. 172.16.0.27 - 1,098 visits
10. 192.168.1.103 - 987 visits

Total visits: 45,678
Unique IPs: 1,234

Would you like me to generate a detailed report?
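
The `awk | sort | uniq -c | sort -rn | head` pipeline is a frequency count over the first field of each line. The same idea can be sketched with `collections.Counter`; the function name `top_ips` is hypothetical, and the sketch assumes, like the `awk` command, that the client IP is the first whitespace-separated field.

```python
from collections import Counter


def top_ips(lines, n=10):
    """Return the `n` most frequent first fields as (ip, count) pairs."""
    counts = Counter(
        line.split(None, 1)[0]  # first whitespace-separated field
        for line in lines
        if line.strip()  # skip blank lines
    )
    return counts.most_common(n)
```

A usage example: `with open("access.log") as log: print(top_ips(log))`.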

Using Auto-Accept Mode

bash

# Press Shift+Tab to switch to Auto-Accept Mode
[Bottom shows: ⏵⏵ accept edits on]

You: Batch process all text files in the data directory, normalize formats

Claude: [Auto-executes]
Processing complete:

1. Converted encoding to UTF-8
2. Normalized line endings to LF
3. Removed trailing whitespace
4. Added trailing newlines

Files processed: 34
Total time: 2.3 seconds
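
The four normalization steps listed above can be expressed as one function over a file's bytes. This is a sketch of what such a pass might do, not the code Claude Code runs: the name `normalize_text` is hypothetical, falling back to GBK for non-UTF-8 input is an assumption carried over from the encoding example, and multiple blank lines at the end of a file are collapsed into a single trailing newline.

```python
def normalize_text(raw, fallback_encoding="gbk"):
    """Return UTF-8 bytes with LF endings and no trailing whitespace."""
    # 1. Decode as UTF-8, falling back to the assumed legacy encoding.
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = raw.decode(fallback_encoding)
    # 2. Normalize CRLF and lone CR line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # 3. Strip trailing spaces and tabs from every line.
    lines = [line.rstrip(" \t") for line in text.split("\n")]
    # 4. End non-empty files with exactly one newline.
    body = "\n".join(lines).rstrip("\n")
    return (body + "\n").encode("utf-8") if body else b""
```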

Common Commands

bash

/model opus    # Use Opus for complex text analysis
/fast          # Use fast mode for simple operations
/cost          # Check usage cost

Best Practices

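- Back up original files: make a copy before any batch replacement.
- Test regex patterns: run them on a small sample before applying them to the full file set.
- Check encoding: confirm each file's encoding before processing it.
- Use pipes: chain commands so that intermediate results do not need temporary files.
- Save results: write output to a new file, and replace the original only after reviewing it.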
