Add AI_POLICY for clarification how to use AI agents #1740
Pull request overview
Adds project documentation and process guardrails for AI-assisted contributions, including a new AI policy doc, agent guidance for working in the repository, and a PR-template disclosure checkbox.
Changes:
- Add `AI_POLICY.md` describing expectations for responsible AI-assisted development.
- Add `AGENTS.md` with repository map and rules for agent-style coding tools.
- Update the PR template and README to surface AI policy/disclosure.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| README.md | Adds a link to the new AI policy in the contribution resources table. |
| AI_POLICY.md | New AI policy document for AI-assisted development expectations. |
| AGENTS.md | New guidance for agent tools working in this repo (layout, principles, workflow). |
| .github/pull_request_template.md | Adds an AI-assisted code generation disclosure checkbox. |

> ### 6. LLM code review
>
> So far, it is not possible to use paid LLM models for code review in open source ASF projects. However, one could use personal licenses for LLMs to do the same.

I think we need to improve the description here, like:

> Some AI review tools (for example, GitHub Copilot review or CodeRabbit) may not currently be enabled for ASF-hosted repositories due to operational, budget, or permission considerations. Contributors may still use personal AI tools locally, but remain responsible for code quality, licensing, and review outcomes.

> - [ASF Generative Tooling Guidance](https://www.apache.org/legal/generative-tooling.html) - Official Apache guidance on AI tool usage
> - [GitHub Copilot](https://github.com/features/copilot) - AI pair programmer and code reviewer we use
> - [LLM Leaderboard](https://llm-stats.com/) - LLM Stats Score, it's better to use high-ranked models

Hi @leborchuk, I don't think that adding a project-level [...]

My first concern is that these files are not just passive documentation. For tools like Codex or Claude Code, project-level instruction files can be loaded automatically and affect every local AI session in this repository. As far as Codex is concerned, [...]

My second concern is the content. A good project-level agent file should mainly capture hard project constraints and practical project knowledge: required license/header rules, build and test commands, high-risk subsystems, generated files, and so on. It should constrain or assist the agent where the project has real requirements. It should not try to tune the agent's general coding style, personality, or workflow. Many items in the proposed [...]

@my-ship-it Please consider this change seriously.

AGENTS.md is not suitable for committing to git, as it is platform- and user-specific.

See detailed discussion in https://lists.apache.org/thread/3kq1391n3n0rzo0wchygmt0cyy59rzq9

As a result of that discussion, I've added:

1. AI_POLICY.md - a note for developers on what using AI agents means
2. AGENTS.md - a description for LLMs of how to work with the project code
3. .github/pull_request_template.md - a new flag "This PR contains AI-assisted code generation"
4. README.md - a link to the policy from the README file

Thank you, got it, remove AGENTS.md

tuhaihe left a comment
One more small suggestion: it may be good to follow a formatting style similar to README.md or CONTRIBUTING.md, especially for line wrapping. Limiting line length to around 76–80 characters would make the document more consistent with other markdown documentation and improve readability in terminal/editor views.
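
A quick way to see which lines exceed the suggested width is a small script like the one below. This is only a minimal sketch, not part of the PR: the function name `overlong_lines`, the default limit of 80, and the rule that lines containing a URL are exempt are all assumptions made for illustration.

```python
def overlong_lines(text: str, limit: int = 80) -> list[tuple[int, int]]:
    """Return (line_number, length) for each line longer than `limit`."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Assumption: lines carrying a URL are exempt, since links cannot wrap.
        if "http://" in line or "https://" in line:
            continue
        if len(line) > limit:
            hits.append((number, len(line)))
    return hits


if __name__ == "__main__":
    # Hypothetical sample text: one short line, one long line, one long URL line.
    sample = "short line\n" + "x" * 95 + "\nsee https://example.org/" + "y" * 90
    print(overlong_lines(sample))
```

Running it over the text of a markdown file would list the lines that still need wrapping, while leaving long link lines alone.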
A couple of additional suggestions that may help improve the overall developer workflow around AI-assisted contributions:
For example:
This could provide lightweight transparency/provenance information without making AI disclosure a strict requirement. If this direction looks reasonable, we could also update
This would position it more as a reference/example template rather than a required project file, while still helping contributors quickly get started with AI-assisted development practices. Some other ASF projects also ship

tuhaihe left a comment
Just left some new comments. Thanks again!

> ## Project overview
>
> Apache Cloudberry is an Apache Incubator project and an open-source massively parallel processing database. It evolved from Greenplum Database and is built on a PostgreSQL kernel. It is used for data warehouse, large-scale analytics, and AI or ML workloads.

Suggested change:

> Apache Cloudberry is an Apache Incubator project and an open-source massively parallel processing database. It evolved from Greenplum Database and is built on a modern PostgreSQL kernel. It is used for data warehouse, large-scale analytics, and AI or ML workloads.

> - Keep changes as small and direct as possible.
> - Do not perform broad code refactoring. Cloudberry's core is PostgreSQL-based, and unnecessary refactoring makes familiar code harder for maintainers to recognize and review.
> - Preserve PostgreSQL and Greenplum coding style in the area being edited.

Suggested change:

> - Preserve PostgreSQL and Cloudberry coding style in the area being edited.

> - [README.md](README.md) — project introduction, community links, contribution overview, and license information.
> - [CONTRIBUTING.md](CONTRIBUTING.md) — contribution expectations and community guidance.
> - [AI_POLICY.md](AI_POLICY.md) — rules for AI-assisted development.

Suggested change:

> - [AI_GUIDELINE.md](AI_GUIDELINE.md) — rules for AI-assisted development.

> ## AI-assisted contribution policy
>
> Follow [AI_POLICY.md](AI_POLICY.md):

Suggested change:

> Follow [AI_GUIDELINE.md](AI_GUIDELINE.md):

> - AI-assisted changes must pass normal review, testing, and CI standards.
> - The contributor must ensure license compatibility.
> - Significant AI-generated code should be disclosed using the PR template checkbox.
> - Do not use AI to auto-generate responses to maintainer review feedback.

> Do not use AI to auto-generate responses to maintainer review feedback.

Maybe we need to keep this description aligned with the new wording in the AI guidelines.

> - Confirm documentation updates when needed.
> - Confirm security review consideration.
> - Disclose significant AI-assisted code generation.

We could also add guidelines on commit messages, like:

> ## Commit Conventions
>
> - Add the standard Apache License header to newly created files (not needed for third-party files).
> - When drafting the commit message, please take the [.gitmessage](.gitmessage) template as a reference.
> - ...

See detailed discussion in https://lists.apache.org/thread/3kq1391n3n0rzo0wchygmt0cyy59rzq9

As a result of that discussion, I've added:

- `AI_GUIDELINE.md` - a note for developers on what using AI agents means
- `AGENTS.md.template` - a template for creating your own `AGENTS.md` file
- `.github/pull_request_template.md` - a new flag "This PR contains AI-assisted code generation"
- `README.md` - a link to the guideline from the README file

Fixes #ISSUE_Number
What does this PR do?
Type of Change
Breaking Changes
Test Plan
- `make installcheck`
- `make -C src/test installcheck-cbdb-parallel`
Impact
Performance:
User-facing changes:
Dependencies:
Checklist
Additional Context
CI Skip Instructions