
Ticket #a46e4d

open

Problem

Build a Python-based interface that collects user conversion data and shows user engagement.
S: Create a KPI page of user visits to our site. What are they clicking on?
M: Build a page that shows user engagement on our site. Here are some examples:
http://165.232.48.178/playgrounds/javascript_exercises/DOM/testinghtml/kpi1.html
http://165.232.48.178/playgrounds/javascript_exercises/DOM/testinghtml/kpi2.html
http://165.232.48.178/playgrounds/javascript_exercises/DOM/testinghtml/kpi3.html
http://165.232.48.178/playgrounds/javascript_exercises/DOM/testinghtml/kpi4.html
What are the minimum metrics that must appear on the KPI dashboard for the project to be considered “complete”? Example: “Unique visitors per day, top 10 clicked elements, total visits per page.” Do you want charts, tables, or both? Should the dashboard allow date-filtering (last 24 hours, last 7 days, custom range)?
A: This will allow us to collect data on the activity of our website. We are building, but we don't know whether people are visiting our site.
R: This data will help us pivot. We are flying blind without it.
T: 1 week. Add 404 errors to the graphs. Flag users who make too many 404 requests. Show all the requests.

Creator

marcus

Priority

1

Urgent

urgent

Important

important

Project Name

zfrika

Question

How does this compare to Google Analytics?
What are we measuring? Signups, sales, visitors vs. signups?
Which pages or sections of the website do you want to track first?
Which specific user actions must be tracked? (choose all that apply)
* Page visits
* Button clicks
* Navigation clicks
* Form submissions
* Scroll depth
* Time on page
Do you want real-time tracking or daily summaries?
Will data be stored in MongoDB, MySQL, flat files, or something else?

MEASURABLE
What are the minimum metrics that must appear on the KPI dashboard for the project to be considered “complete”? Example: “Unique visitors per day, top 10 clicked elements, total visits per page.”
Do you want charts, tables, or both?
Should the dashboard allow date-filtering (last 24 hours, last 7 days, custom range)?

ACHIEVABLE
Will you be the only developer, or is this a team task?
Do you already have tracking scripts installed on pages, or do they need to be added from scratch?

RELEVANT
What business question must this KPI dashboard answer? Example: “Which pages are users ignoring?” or “Where do users click most?”

TIME-BOUND
Can you realistically finish both data collection and the dashboard UI in 1 week?
What is the exact deadline date?
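The example metrics named above (unique visitors per day, top 10 clicked elements, total visits per page, with a date filter) could be computed along the following lines. This is only a sketch: the event shape (`ts`, `visitor`, `page`, `element`) and the function name `summarize_events` are assumptions made for illustration, since the ticket does not yet define how events are collected or stored.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def summarize_events(events, since=None):
    """Aggregate tracking events into the three example KPIs.

    Each event is a dict with hypothetical keys:
      ts (datetime), visitor (str), page (str), element (str or None).
    An event with an element is counted as a click; one without is a page visit.
    """
    visitors_per_day = defaultdict(set)
    visits_per_page = Counter()
    clicks = Counter()

    for ev in events:
        # Date filter: skip anything older than the requested window.
        if since is not None and ev["ts"] < since:
            continue
        visitors_per_day[ev["ts"].date()].add(ev["visitor"])
        if ev.get("element"):
            clicks[ev["element"]] += 1
        else:
            visits_per_page[ev["page"]] += 1

    return {
        "unique_visitors_per_day": {
            day: len(ids) for day, ids in sorted(visitors_per_day.items())
        },
        "visits_per_page": dict(visits_per_page),
        "top_clicked": clicks.most_common(10),
    }


if __name__ == "__main__":
    now = datetime.now()
    sample = [
        {"ts": now, "visitor": "a", "page": "/", "element": None},
        {"ts": now, "visitor": "a", "page": "/", "element": "signup-button"},
    ]
    # "Last 7 days" filter.
    print(summarize_events(sample, since=now - timedelta(days=7)))
```

Passing `since=None` gives all-time totals; the "last 24 hours" and custom-range filters would follow the same pattern with a different cutoff.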

Root Cause

# All possible insights from this PM2 log extract

Below I’ve pulled out **actionable** observations and inferences you can make from the log lines you posted. I grouped them so you can quickly find operational, security, product, and analytics insights.

---

## 1) Traffic & client identity

* Several **well-known crawler IPs** appear repeatedly (addresses in the `66.249.*` range = Googlebot; `207.46.*` / `40.77.*` look like Microsoft/Bing/Azure). That indicates automated indexing / crawling is happening.
* Other IPs (e.g. `54.209.*`, `52.167.*`, `172.3.*`) are cloud provider ranges (AWS, Azure, Cloudflare/other); they could be legitimate services, bots, or cloud-hosted scrapers.
* There are repeated requests from the **same IPs** to `/` and various endpoints: either normal repeated visits, a scripted crawler, or a scanner.

## 2) Endpoint popularity / usage

* Root `/` is requested frequently and returns a consistent payload size (`- 6120`). Root is a top entry point.
* Several REST-style endpoints with `?country=...` (e.g. `/rent?country=Uganda`, `/service?country=Gabon`) return large payloads (~`12522` bytes) and ~70–90 ms response times; these are clearly used and returning lists from the DB.
* Some country-name page requests (e.g. `/Madagascar`, `/Uganda`, `/Tunisia`) return different sizes (`~14k`); those pages are heavier and likely contain more content or different payload formatting.

## 3) Crawler / scraping patterns

* Many sequential `?country=...` requests with different African country values suggest either:
  * a crawler iterating through country endpoints, or
  * a scripted data-collector or search engine following links.
* The presence of Googlebot-style IPs hitting these country endpoints suggests indexing rather than targeted scraping, but the frequency and volume point to automated activity.

## 4) Performance metrics

* Typical successful API responses to country-list endpoints appear in the **70–90 ms** range, which is acceptable for many applications.
* A few outliers: `476.761 ms` for `/foodcater?country=Rwanda` is a clear performance spike worth investigating; `92.313 ms` for `/rent?country=Uganda` sits only just above the typical range.
* Many static or missing-resource requests are very fast (~1.5–3 ms), as expected for 404s.

## 5) Response codes: success vs missing

* Many `200` responses for valid app endpoints (country queries and `/`).
* **A huge number of `404` responses** for `.php` files, WordPress paths, filemanager names, admin pages. That indicates scanning/probing for common web shells, plugins, or misconfigured WordPress sites.
* Some `304` (cached) responses are present (e.g. `^[[36m304`), indicating caching/conditional requests are working for some resources.

## 6) Security & reconnaissance signals

* Repeated requests to names such as `wp_filemanager.php`, `tinyfilemanager.php`, `filemanager.php`, `wp-conflg.php` (misspelling of `wp-config.php`), `file9.php`, `sh3ll.php`, `/cgi-bin/`, `.well-known/.qiodetme.php`, `/.bod/.ll/ss.php` strongly indicate **automated vulnerability scanning** and attempts to find web shells or exploitable admin endpoints.
* The diversity of filenames and WP-related paths suggests a **broad opportunistic scan** (not targeted to a single plugin).
* Multiple requests to `wp-admin` subpaths and to `wp-cron.php` variants are also consistent with automated WordPress exploit scanners.

## 7) Application correctness / bugs visible from logs

* Some requests log `get list of <thing>` with `country: undefined` or `senior` with undefined. This suggests either:
  * your code is calling an endpoint without supplying the `country` param in some flows, or
  * user input or front-end code sometimes fails to set the parameter.

  This is a functional bug to fix.
* Several 404s for static assets (e.g. `favicon.ico`, `apple-touch-icon.png`, `/playgrounds/css/public/styles_practice1.css`) indicate missing assets or incorrect paths, which can affect UX and SEO.
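The probe names listed in section 6 can be turned into a simple classifier for separating scanner traffic from real page requests on the KPI page. The sketch below is a starting point under one stated assumption: that the application itself serves no `.php`, `wp-admin`, or `cgi-bin` routes, so any request for them is treated as a probe. The names `PROBE_PATTERN`, `is_probe`, and `partition_urls` are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Patterns taken from the probe names observed in the log analysis.
# Assumption: the app has no legitimate routes that match any of these.
PROBE_PATTERN = re.compile(
    r"\.php(/|$)|/wp-admin(/|$)|/cgi-bin(/|$)|filemanager|sh3ll",
    re.IGNORECASE,
)


def is_probe(url: str) -> bool:
    """Return True if the request path looks like a vulnerability probe."""
    path = urlsplit(url).path  # ignore the query string
    return bool(PROBE_PATTERN.search(path))


def partition_urls(urls):
    """Split requested URLs into (probes, normal) lists, preserving order."""
    probes, normal = [], []
    for url in urls:
        (probes if is_probe(url) else normal).append(url)
    return probes, normal


if __name__ == "__main__":
    probes, normal = partition_urls(
        ["/rent?country=Uganda", "/wp-conflg.php", "/sh3ll.php", "/"]
    )
    print("probes:", probes)
    print("normal:", normal)
```

Matching on the path only keeps legitimate `?country=...` requests out of the probe bucket regardless of what the query string contains.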
## 8) Content / product insights

* High number of hits to category endpoints for **Uganda** (and many other African countries); Uganda appears to be a common country requested in this snippet. That suggests a user base or interest concentration by country (useful for content, localization, marketing).
* The site serves many category lists (hair, village, healthcare, tutors, rent, services, etc.), so the product is a directory/listing platform.

## 9) Bot vs human behavior clues

* Short, rapid, sequential 404 requests for many different filenames (often from the same IP) are **bot/scanner-like**. Humans generally don’t browse sequential random php file names.
* Requests with meaningful `country=...` parameters and larger responses are more likely legitimate users or indexers/browsers.

## 10) Potential SEO / discovery issues

* Frequent 404s on `login/*` and `about/*` for certain countries indicate broken internal links or incorrect route generation; these could harm SEO and user experience.
* Missing `favicon.ico` and app icons (404s) are small but can be fixed.

## 11) Log hygiene / observability notes

* Your logs contain colored terminal escape sequences (`^[[32m200^[[0m`). These are fine for the console, but may need stripping when ingesting logs into parsers or log stores.
* You appear to be logging helpful debug lines like `base url printed:` and `User IP Address:`, which is good for tracing.

## 12) Storage & payload size observations

* Payload sizes for list endpoints are consistently ~12522 bytes, which indicates a standard result object set (likely a JSON list of records). If this is too large, consider pagination or limiting fields to reduce bandwidth.

## 13) Anomalies and outliers to investigate

* Very high response time (476 ms) for a single request; look for DB query slowness or transient resource contention.
* Repeated scanning requests that return 404s could escalate to attempted exploits; monitor for any that suddenly return 200.
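Sections 11 and 13 suggest stripping the color escape sequences before parsing and then looking for slow requests. A minimal sketch of both steps follows. It assumes the request lines follow a `METHOD URL STATUS LATENCY ms - SIZE` layout, which is inferred from the fragments quoted above (`^[[32m200^[[0m`, `476.761 ms`, `- 6120`) rather than confirmed against the raw log. The names `parse_line` and `slow_requests` and the 200 ms threshold are hypothetical.

```python
import re

# Matches color codes either as a real ESC byte or as the literal "^[" that
# appears when logs are copied out of a pager or editor.
ANSI_ESCAPE = re.compile(r"(?:\x1b|\^\[)\[[0-9;]*m")

# Assumed layout: METHOD URL STATUS LATENCY ms - SIZE (SIZE may be "-").
REQUEST_LINE = re.compile(
    r"(?P<method>[A-Z]+) (?P<url>\S+) (?P<status>\d{3}) "
    r"(?P<latency_ms>\d+(?:\.\d+)?) ms - (?P<size>\d+|-)"
)


def parse_line(line):
    """Return a dict for a request line, or None for any other log line."""
    clean = ANSI_ESCAPE.sub("", line)
    match = REQUEST_LINE.search(clean)
    if match is None:
        return None  # e.g. debug output such as "base url printed:"
    size = match["size"]
    return {
        "method": match["method"],
        "url": match["url"],
        "status": int(match["status"]),
        "latency_ms": float(match["latency_ms"]),
        "size": None if size == "-" else int(size),
    }


def slow_requests(records, threshold_ms=200.0):
    """Return the parsed records whose latency exceeds the threshold."""
    return [r for r in records if r["latency_ms"] > threshold_ms]


if __name__ == "__main__":
    import sys

    parsed = [rec for rec in map(parse_line, sys.stdin) if rec is not None]
    for rec in slow_requests(parsed):
        print(rec["latency_ms"], rec["url"])
```

Run as a script, it reads log lines from standard input and prints the slow ones. Non-request lines are skipped rather than raising, so the debug output mixed into the PM2 log does not break the parse.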
---

# Recommended actions (short and prioritized)

1. **Block / rate-limit** obvious scanner IPs or add them to a denylist / fail2ban rules (especially if they show repeated probing).
2. **Add a WAF (or tighten rules)** to block common exploit patterns (filemanager, wp-conflg, sh3ll, tinyfilemanager).
3. **Log user-agent and reverse-DNS / geo IP** (if not already); this will help separate legitimate crawlers (Googlebot/Bing) from malicious scanners.
4. **Fix application bugs** that produce `country: undefined` and broken `login/*` routes.
5. **Add pagination / reduce payload size** for large list endpoints if desired for bandwidth/performance.
6. **Serve a proper robots.txt** to instruct well-behaved crawlers what to index.
7. **Monitor and alert** on sudden increases in 404 scanning activity or on any of those previously-404ed filenames returning 200.
8. **Ensure static assets exist** (favicon, apple icons, CSS) to improve UX and avoid unnecessary 404 noise.

---

If you want, I can:

* summarize the most suspicious IPs and suggest iptables/fail2ban rules,
* produce a quick regex to detect similar exploit probes in your logs, or
* parse these lines into a CSV (endpoint, ip, status, latency, size, time) for analysis.

Which of the above would you like next?
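Actions 1 and 7, together with the ticket's requirement to flag users who make too many 404 requests, come down to counting 404s per client. The sketch below assumes each record already carries an `ip` and a `status` field. In the logs described here the IP appears on a separate `User IP Address:` debug line, so joining it to the request line is left as an upstream step. The function name and the threshold of 20 are hypothetical and should be tuned against real traffic.

```python
from collections import Counter


def flag_404_heavy_clients(records, max_404=20):
    """Return (ip, count) pairs for clients with more than max_404 404s.

    records: iterable of dicts with assumed keys "ip" and "status".
    The result is sorted with the noisiest client first.
    """
    counts = Counter(r["ip"] for r in records if r["status"] == 404)
    return [(ip, n) for ip, n in counts.most_common() if n > max_404]


if __name__ == "__main__":
    # Documentation-range addresses, used purely as placeholders.
    sample = (
        [{"ip": "203.0.113.5", "status": 404}] * 25
        + [{"ip": "198.51.100.7", "status": 200}] * 10
    )
    for ip, n in flag_404_heavy_clients(sample):
        print(f"{ip} made {n} requests that returned 404")
```

The flagged list can feed both the dashboard (a "too many 404s" table) and a denylist review, but a known crawler such as Googlebot following broken internal links can also exceed the threshold, so flagged IPs are worth checking before they are blocked.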

Notes

I have access to the PM2 logs. There is a path for viewing the data: http://165.232.48.178/plots/MOST_ACCESSED_URL/prod-app2-out__2025-04-03.png

Helpful Links

http://165.232.48.178/playgrounds/javascript_exercises/DOM/testinghtml/kpi1.html
http://165.232.48.178/playgrounds/javascript_exercises/DOM/testinghtml/kpi2.html
http://165.232.48.178/playgrounds/javascript_exercises/DOM/testinghtml/kpi3.html
http://165.232.48.178/playgrounds/javascript_exercises/DOM/testinghtml/kpi4.html

Strategy

S: Specific
You want to “create a KPI page of user visits and clicks on your site.” Good start, but it’s still too broad. What exact metrics? Which pages? Which user actions?

M: Measurable
You provided example dashboards (kpi1–4). That helps, but the goal should define success criteria. For example:
* Track unique visitors
* Track clicks per element
* Track time on page
* Track daily/weekly totals
Right now it doesn't specify what “done” looks like.

Next Steps

http://165.232.48.178/zfrika_plots/
http://165.232.48.178/254kenya_plots/

Ticket Information

Ticket ID: 690d5c7d19dfd0d04fa46e4d
Date Initiated: 11/10/2025, 10:40:00 AM
Status: open
Urgent: urgent
Important: important
Created: 11/6/2025, 6:42:05 PM
Last Updated: 11/24/2025, 8:21:51 PM