HTTP

  • TCP 80: HTTP unencrypted
  • TCP 443: HTTPS encrypted
  • PORT (Web is often on other ports, especially internal proxies or admin pages on 8080 or 8443)

Web

  • 1. Technology & Security Fingerprinting (Use whatweb and nikto to identify the server, frameworks, and WAF, and curl to inspect headers and robots.txt)
  • 2. Content & vHost Discovery (Run feroxbuster or gobuster dir to bruteforce directories/files, and gobuster vhost to find hidden virtual hosts)
  • 3. Automated Vulnerability Scanning (Use nikto or wapiti to scan for common misconfigurations and known vulnerabilities like outdated software)
  • 4. Manual Application Testing (OWASP Top 10) (After automated scans, manually inspect the application for logical flaws, focusing on Injection, Broken Access Control, and XSS)
# HTTP Headers + robots.txt
curl -skLI -o curl_http_headers.txt http://<TARGET>
curl -skL -o curl_robots.txt http://<TARGET>/robots.txt
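
A minimal sketch for pulling the interesting paths out of the saved robots.txt, assuming the file was written to curl_robots.txt by the command above. The function name robots_paths is hypothetical, and the parsing is deliberately loose (it ignores User-agent grouping).

```python
import sys
from pathlib import Path


def robots_paths(robots_txt: str) -> dict:
    """Group the values of Disallow, Allow, and Sitemap lines from robots.txt text."""
    found = {"disallow": [], "allow": [], "sitemap": []}
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        # Keep only the three fields of interest, skip empty values and duplicates
        if field in found and value and value not in found[field]:
            found[field].append(value)
    return found


if __name__ == "__main__":
    # Assumption: default file name matches the curl -o argument used above
    path = Path(sys.argv[1] if len(sys.argv) > 1 else "curl_robots.txt")
    if path.exists():
        for field, values in robots_paths(path.read_text(errors="replace")).items():
            for value in values:
                print(f"{field}: {value}")
```

Disallow entries are worth requesting by hand, since they often point at admin panels or backup directories.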

---

# Checks for WAF (web application firewall)
wafw00f <TARGET>

# Enum web server + version + OS + frameworks + libraries
whatweb --aggression 3 http://<TARGET> --log-brief=whatweb_scan.txt

# Fingerprint web server
nikto -o nikto_fingerprint_scan.txt -Tuning b -h http://<TARGET>

# Enum web server vulns
nikto -o nikto_vuln_scan.txt -h http://<TARGET>

# Enum web app logic & vulns
wapiti -f txt -o wapiti_scan.txt --url http://<TARGET>

# Webpage Crawler
pip3 install --break-system-packages scrapy
wget -O ReconSpider.zip https://academy.hackthebox.com/storage/modules/144/ReconSpider.v1.2.zip && unzip ReconSpider.zip
python3 ReconSpider.py <URL> && cat results.json
# !!! CHECK "results.json" !!!
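
A rough sketch for triaging results.json without reading it line by line. It assumes the file is a JSON object whose top-level values are lists of findings; that layout is an assumption about the crawler's output, and summarize_results is a hypothetical helper name.

```python
import json
from pathlib import Path


def summarize_results(data: dict) -> dict:
    """Map each top-level key to the number of items collected under it."""
    summary = {}
    for key, value in data.items():
        # Lists and dicts are counted by length; any other value counts as one item
        summary[key] = len(value) if isinstance(value, (list, dict)) else 1
    return summary


if __name__ == "__main__":
    results_file = Path("results.json")
    if results_file.exists():
        counts = summarize_results(json.loads(results_file.read_text()))
        # Largest categories first, so the noisiest findings are obvious
        for key, count in sorted(counts.items(), key=lambda kv: -kv[1]):
            print(f"{count:6d}  {key}")
```

The counts only show where to look first; the individual entries still need a manual read.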

vHost Brute-Force

Only the HTTP Host header changes between requests; the target IP and URL stay the same.

# NOTE: filter by response size (-fs). Servers typically answer 200 OK with the default page for unknown Host values, so the status code alone does not reveal real vHosts
ffuf -ic -w /usr/share/wordlists/seclists/Discovery/DNS/subdomains-top1million-5000.txt:FUZZ -H 'Host: FUZZ.<DOMAIN>' -u http://<TARGET>/ -fs <SIZE>
# Add NEW vHosts to automatically resolve them later
echo '<IP_ADDR> <VHOST>.<FQDN>' | sudo tee -a /etc/hosts
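
One way to choose <SIZE> is to run ffuf once without -fs, note the Size value printed for each result, and treat the most common size as the default-page baseline. The sketch below shows that selection logic on made-up numbers; split_by_baseline and the sample sizes are hypothetical.

```python
from collections import Counter


def split_by_baseline(sizes_by_host: dict) -> tuple:
    """Return (baseline_size, outliers), where the baseline is the most common size."""
    if not sizes_by_host:
        raise ValueError("no responses to analyse")
    counts = Counter(sizes_by_host.values())
    baseline, _ = counts.most_common(1)[0]
    # Anything that differs from the baseline is a candidate vHost
    outliers = {host: size for host, size in sizes_by_host.items() if size != baseline}
    return baseline, outliers


# Hypothetical sizes copied from an unfiltered run
observed = {"www": 10918, "mail": 10918, "dev": 10918, "admin": 2734, "blog": 10918}
baseline, candidates = split_by_baseline(observed)
print(f"-fs {baseline}")
print(candidates)
```

If the default page embeds the requested hostname, its size varies with the length of FUZZ, so a single -fs value may not filter everything.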

Directory Brute-Forcing

# OTHER LARGER DIR LIST
/usr/share/seclists/Discovery/Web-Content/raft-medium-directories.txt

# Directory Bruteforce
feroxbuster -t 64 -w /usr/share/seclists/Discovery/Web-Content/common.txt --depth 2 -o feroxbuster_dir_common --scan-dir-listings -u http://<TARGET>

# Bruteforce File Extensions (-x)
feroxbuster -t 64 -w /usr/share/seclists/Discovery/Web-Content/common.txt --depth 2 -o feroxbuster_dir_extensions --scan-dir-listings -x php,html,txt,bak,zip -u http://<TARGET>
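
The cost of -x is easy to underestimate: each word is requested as-is and once more per extension. The following is an approximation of that expansion for sizing a run, not feroxbuster's actual implementation; expand_wordlist is a hypothetical name.

```python
def expand_wordlist(words, extensions):
    """Yield each word followed by the word with every extension appended."""
    for word in words:
        yield word
        for ext in extensions:
            yield f"{word}.{ext.lstrip('.')}"


words = ["admin", "backup"]
extensions = ["php", "bak"]
candidates = list(expand_wordlist(words, extensions))
print(candidates)
# Requests per directory = len(words) * (1 + len(extensions))
print(len(candidates))
```

With the five extensions used above, every scanned directory costs roughly six times the wordlist length, before recursion is counted.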

---

# AUTOMATED Recon
git clone https://github.com/thewhiteh4t/FinalRecon.git
cd FinalRecon
chmod +x ./finalrecon.py
python3 -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt
./finalrecon.py -nb -r -cd final_recon_scan -w /usr/share/wordlists/dirb/common.txt --headers --crawl --ps --dns --sub --dir --url http://<URL>

TODO: cull down AI slop below

URL Encoding

# 1. Curl Auto-Encoding (GET Requests)
# -G converts --data into a GET query string. --data-urlencode handles the special chars.
curl -G -i "http://<TARGET>/cgi/welcome.bat" --data-urlencode "cmd=C:\windows\system32\whoami.exe & id"

# 2. Python One-Liner (For generating payloads for Burp/Browser)
python3 -c "import urllib.parse, sys; print(urllib.parse.quote(sys.argv[1]))" 'cat /etc/passwd & id'

# 3. The "Slicker Way" (Add this to your ~/.zshrc or ~/.bashrc)
alias urlencode='python3 -c "import urllib.parse, sys; print(urllib.parse.quote(sys.argv[1]))"'
# Usage: urlencode "payload&goes=here"
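
A short comparison of the standard-library encoders, as a sketch of what the one-liner above actually emits. Note that urllib.parse.quote leaves "/" unencoded by default, which may or may not be what a payload needs.

```python
from urllib.parse import quote, quote_plus

payload = "cat /etc/passwd & id"

default_encoded = quote(payload)          # "/" is in the default safe set
strict_encoded = quote(payload, safe="")  # encode "/" as well
form_encoded = quote_plus(payload)        # form style: spaces become "+"

print(default_encoded)  # cat%20/etc/passwd%20%26%20id
print(strict_encoded)   # cat%20%2Fetc%2Fpasswd%20%26%20id
print(form_encoded)     # cat+%2Fetc%2Fpasswd+%26+id
```

Passing safe="" is the conservative choice when the payload lands inside a query-string value.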

URL Encoding (Percent-Encoding) is not an obfuscation technique; it is a mechanical requirement of URL syntax, which HTTP relies on. You must encode reserved characters so the web server does not confuse your payload data with HTTP syntax.

Character   | HTTP Syntax Meaning | Why it breaks exploits if unencoded
----------- | ------------------- | -----------------------------------
&           | Parameter Separator | Server splits your payload. ?cmd=id & whoami becomes Param 1: cmd=id, Param 2: whoami.
#           | URL Fragment        | Browser stops sending data after #. The backend never sees it.
+ / (space) | Space               | Raw spaces break the HTTP request line (GET /page HTTP/1.1).
?           | Query String Start  | Truncates or confuses path traversal payloads.
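
The & and # rows can be reproduced with Python's own URL parser, which splits a query in the same general way a typical backend does. This is an illustrative sketch; target.example is a placeholder host, and real servers may differ in details.

```python
from urllib.parse import parse_qsl, quote, urlsplit

# "&" unencoded: the payload is split into two parameters
raw_params = parse_qsl("cmd=id & whoami", keep_blank_values=True)
# "&" encoded as %26: the payload stays inside one parameter
encoded_query = "cmd=" + quote("id & whoami", safe="")
encoded_params = parse_qsl(encoded_query, keep_blank_values=True)

print(raw_params)      # [('cmd', 'id '), (' whoami', '')]
print(encoded_params)  # [('cmd', 'id & whoami')]

# "#" unencoded: everything after it is a fragment, which clients do not send
raw_url = urlsplit("http://target.example/page?cmd=id#whoami")
encoded_url = urlsplit("http://target.example/page?cmd=id%23whoami")

print(raw_url.query, "|", raw_url.fragment)          # cmd=id | whoami
print(encoded_url.query, "|", encoded_url.fragment)  # cmd=id%23whoami |
```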

The CGI / Command Injection Rule: When exploiting CGI scripts (.sh, .bat, .cgi), the web server decodes the URL and hands the resulting string to the OS shell (/bin/bash or cmd.exe). If you do not URL-encode your shell operators (&, |, ;), the HTTP parsing phase can treat them as URL syntax first (for example, & splits the query into separate parameters), so the shell never receives them as part of your payload.

  • Double Encoding (WAF Bypass): If a WAF blocks %5C (\), encode the % symbol itself (% = %25). The payload becomes %255C. The WAF sees %255C (Allowed), passes it to the backend, which decodes it once to %5C, and the application decodes it again to \.
  • Space Variants:
    • In the URL Path (GET /path%20here), use %20.
    • In the Query String / Body (?cmd=id+whoami), + is historically interpreted as a space (application/x-www-form-urlencoded), but %20 is universally safer to avoid parsing desyncs. Default to %20.
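
Both points above can be checked locally. The sketch below encodes a backslash twice and decodes it in two passes, then shows that "+" only becomes a space under form-style decoding, while %20 decodes to a space either way.

```python
from urllib.parse import quote, unquote, unquote_plus

# Double encoding: "\" -> %5C -> %255C
single = quote("\\", safe="")
double = quote(single, safe="")
first_pass = unquote(double)       # what the first decoder produces
second_pass = unquote(first_pass)  # what a second decoder produces

print(single, double)           # %5C %255C
print(first_pass, second_pass)  # %5C \

# Space variants: "+" depends on the decoder, %20 does not
print(unquote("id+whoami"))         # id+whoami
print(unquote_plus("id+whoami"))    # id whoami
print(unquote("id%20whoami"))       # id whoami
print(unquote_plus("id%20whoami"))  # id whoami
```

The double-encoding bypass only works when two decoding passes really happen; with a single pass the application receives the literal text %5C.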