Conversation
Refactored file path construction in the crawler package to eliminate leading slashes, ensuring correct paths for CSS, JS, and image assets. Additionally, improved handling of empty URL paths when opening index.html files. This enhances the robustness of asset loading and path management across the application.
There was a problem hiding this comment.
Pull Request Overview
This PR standardizes file path construction across the crawler and utility packages by removing hardcoded leading slashes, adopting filepath.Join for cross-platform compatibility, and adding logic to correctly handle empty URL paths when writing index.html.
- Removed unnecessary leading slashes from asset link and directory paths
- Switched to
filepath.Joinin CSS asset saving for readability and portability - Added conditional logic to place
index.htmlin project root whenurlPathis empty
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/netutil/netutil.go | Dropped leading slash in dirPath for both Extractor and GetAssetDir |
| pkg/crawler/js.go | Removed leading slash from JS asset link formatting |
| pkg/crawler/img.go | Removed leading slash from image asset link formatting |
| pkg/crawler/css.go | Replaced string concatenation with filepath.Join for CSS asset file opening |
| pkg/crawler/crawler.go | Standardized CSS link formatting; introduced empty-urlPath handling for index.html placement |
Comments suppressed due to low confidence (1)
pkg/crawler/crawler.go:311
- Add unit tests for both empty and non-empty
urlPathscenarios to verify thatfilePathis constructed correctly andindex.htmlends up in the expected directory.
// Fix the file path issue - handle empty urlPath properly
| dirPath := Extensions[ext] + "/" | ||
| if dirPath != "" { |
There was a problem hiding this comment.
If ext is not found in Extensions, Extensions[ext] could be empty and lead to dirPath being just "/". Consider explicitly handling the ext == "" case to avoid returning an unintended root path.
| dirPath := Extensions[ext] + "/" | |
| if dirPath != "" { | |
| dirPath, ok := Extensions[ext] | |
| if ok && dirPath != "" { | |
| dirPath += "/" |
| // GetAssetDir is ... | ||
| func GetAssetDir(filename string) string { | ||
| dirPath := "/" + Extensions[urlExtension(filename)] + "/" | ||
| dirPath := Extensions[urlExtension(filename)] + "/" |
There was a problem hiding this comment.
GetAssetDir will return "/" when filename has no extension. You may want to check for an empty extension and return an empty string instead of a misleading root directory.
| dirPath := Extensions[urlExtension(filename)] + "/" | |
| ext := urlExtension(filename) | |
| if ext == "" { | |
| return "" | |
| } | |
| dirPath := Extensions[ext] + "/" |
| } else { | ||
| filePath = filepath.Join(projectPath, cleanURLPath, "index.html") | ||
| } | ||
|
|
There was a problem hiding this comment.
Before opening the file, ensure the target directory exists (e.g., via os.MkdirAll(filepath.Dir(filePath), 0o755)) to prevent errors when writing to nested paths.
| // Ensure the directory structure exists | |
| if err := os.MkdirAll(filepath.Dir(filePath), 0o755); err != nil { | |
| log.Fatal("Failed to create directories for file path:", err) | |
| } |
This pull request focuses on improving file path handling and standardizing path construction across the codebase. The changes primarily address issues with unnecessary leading slashes in paths and ensure proper handling of edge cases, such as empty paths.
File Path Handling Improvements:
pkg/crawler/crawler.go: Fixed path construction by removing unnecessary leading slashes innewLinkfor CSS links and implemented logic to handle emptyurlPathwhen constructing file paths forindex.html. [1] [2]pkg/crawler/css.go: Updated the file path construction inparseCSSto usefilepath.Joinfor better cross-platform compatibility and readability.pkg/crawler/img.go: Removed the leading slash innewLinkfor image paths to standardize path formatting.pkg/crawler/js.go: Standardized JavaScript file paths by removing the leading slash innewLink.pkg/netutil/netutil.go: UpdateddirPathconstruction inExtractorandGetAssetDirfunctions to remove unnecessary leading slashes, ensuring consistent path formatting. [1] [2]