Commit e5e7ebc

Harden data-update-partners workflow (#270)
* Harden data-update-partners workflow and add Last_Name fallback
  Add error handling, input validation, and safety checks to prevent empty data from overwriting good data. Add Last_Name field support for maintainers, falling back to Last_Name when First_Name is empty.
* Fix curl error handling to work with set -euo pipefail
  Use conditional capture (if ! response=$(curl ...)) instead of $? check, which was unreachable under set -e.
* Add days since last commit to maintainers JSON
  Call days_since_last_commit.py per maintainer to enrich the JSON with a Days_since_last_commit field. Add --days-only flag to the script for machine-readable output.
* Limit days-since-last-commit to pinned repos only
  Use GitHub GraphQL API to fetch only pinned repositories instead of all org repos, reducing API calls and avoiding errors from large or problematic repositories.
* Retry on server errors instead of warning
  Add exponential backoff retry (up to 3 attempts) for 500+ errors when fetching commit dates, silently returning None if all retries are exhausted.
* Use Search Commits API for days since last commit
  Replace per-repo iteration with a single search API call per maintainer (author:{user} org:{org}), drastically reducing API usage and eliminating server error warnings.
* Handle 403 rate limits with retry and backoff
  Fix Accept header, add retry with Retry-After for 403 responses, handle 422 validation errors, and return None instead of crashing when all retries are exhausted.
* Add verbose logging for maintainer processing
* Skip maintainers without GitHub profile, show name in logs
  Display First_Name and Last_Name alongside username in processing logs. Skip entries with null/empty GitHub field with a warning showing the maintainer's name.
* Improve days-since-last-commit log output for no-commit users
* Fix rate limiting: increase retries and add delay between calls
  Increase retry attempts to 5 with longer backoff (10s, 20s, 40s...), and add 2s delay between maintainer lookups to avoid triggering GitHub search API secondary rate limits.
* Track last activity (commits, issues, comments) not just commits
  Search across commits, opened issues/PRs, and comments to find the most recent activity date per maintainer. Rename JSON field to Days_since_last_activity. Add 1s delay between search API calls.
* Remove comment search from activity check
  The commenter search used issue updated_at which reflects when anyone last touched the issue, not when the user commented, giving false recent activity. Only track commits and opened issues/PRs which have reliable dates.
* Fail job on partner fetch error instead of continuing with partial data
* Fail fast on maintainers fetch error with clear error message
* Make avatar lookup best-effort to avoid failing under pipefail
* Distinguish lookup failures from no-activity in days_since script
  Return ERROR sentinel and exit 1 on API failures (rate limit exhaustion, HTTP errors, network errors). Reserve -1 for genuine no-activity. Workflow catches ERROR and falls back to -1 with a warning instead of crashing.
1 parent 7050481 commit e5e7ebc
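
The backoff schedule mentioned in the commit message (10s, 20s, 40s... over 5 attempts, with Retry-After taking precedence) can be sketched roughly as follows. This is an illustration only: `backoff_delays` and `wait_seconds` are hypothetical helper names that do not appear in the commit, and the guard for a non-numeric Retry-After value (the header may also carry an HTTP date) is an added assumption, not something the script does.

```python
def backoff_delays(retries=5, base=10):
    """Fallback wait (in seconds) for each attempt: base * 2**attempt."""
    return [base * (2 ** attempt) for attempt in range(retries)]


def wait_seconds(attempt, retry_after=None, base=10):
    """Prefer the server's Retry-After value, else use the exponential schedule.

    Assumption for illustration: a Retry-After value that is not an integer
    (for example an HTTP date) is ignored and the schedule is used instead.
    """
    if retry_after is not None:
        try:
            return int(retry_after)
        except ValueError:
            pass
    return base * (2 ** attempt)


print(backoff_delays())
```

With the defaults, the worst case across all five attempts adds up to 310 seconds of waiting per search call, which is worth keeping in mind when many maintainers are processed in one job.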

2 files changed

Lines changed: 213 additions & 44 deletions

.github/workflows/data-update-partners-data.yml

Lines changed: 99 additions & 44 deletions
@@ -22,7 +22,7 @@ jobs:
         uses: actions/checkout@v6
         with:
           repository: armbian/armbian.github.io
-          fetch-depth: 0
+          fetch-depth: 1
           clean: false
           path: armbian.github.io
 

@@ -33,26 +33,38 @@ jobs:
           REFRESH_TOKEN: ${{ secrets.ZOHO_REFRESH_TOKEN }}
           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
         run: |
+          set -euo pipefail
+
           ACCESS_TOKEN=$(curl -sH "Content-type: multipart/form-data" \
-            -F refresh_token=$REFRESH_TOKEN \
-            -F client_id=$CLIENT_ID \
-            -F client_secret=$CLIENT_SECRET \
+            -F "refresh_token=$REFRESH_TOKEN" \
+            -F "client_id=$CLIENT_ID" \
+            -F "client_secret=$CLIENT_SECRET" \
             -F grant_type=refresh_token \
             -X POST https://accounts.zoho.eu/oauth/v2/token \
             | jq -r '.access_token')
 
+          if [ -z "$ACCESS_TOKEN" ] || [ "$ACCESS_TOKEN" = "null" ]; then
+            echo "::error::Failed to obtain Zoho access token"
+            exit 1
+          fi
+
           echo "Access token obtained."
 
           # Fetch all partners and combine into single JSON file
           > partners.json
 
           for partner_type in Platinum Gold Silver; do
-            curl --silent --request GET \
+            if ! response=$(curl --silent --fail --request GET \
               --url "https://www.zohoapis.eu/bigin/v2/Accounts/search?fields=Website,Company_slug,Description,Account_Name,Email,Partnership_Status,Promoted&criteria=((Partnership_Status:equals:${partner_type}%20Partner)and(Promoted:equals:true))" \
-              --header "Authorization: Zoho-oauthtoken $ACCESS_TOKEN" \
+              --header "Authorization: Zoho-oauthtoken $ACCESS_TOKEN"); then
+              echo "::error::Failed to fetch ${partner_type} partners"
+              exit 1
+            fi
+
+            echo "$response" \
             | jq -c ".data[] | select(.Company_slug != null and .Company_slug != \"\") | {
               Account_Name,
-              logo_url: (\"https://cache.armbian.com/images/vendors/150/\(.Company_slug)-border.png\"),
+              logo_url: (\"https://cache.armbian.com/images/vendors/150/\(.Company_slug).png\"),
               Website,
               Description,
               Partnership_Status: .Partnership_Status
@@ -63,37 +75,37 @@ jobs:
           jq -s . < partners.json > partners.json.tmp && mv partners.json.tmp partners.json
           jq . -S < partners.json > partners.json.tmp && mv partners.json.tmp partners.json
 
+          # Validate partners data
+          partner_count=$(jq 'length' partners.json)
+          if [ "$partner_count" -lt 1 ]; then
+            echo "::error::Partners JSON is empty, refusing to commit"
+            exit 1
+          fi
+          echo "Fetched $partner_count partners."
+
           # Display content in step summary as markdown sections
           echo "# Partners" >> $GITHUB_STEP_SUMMARY
           echo "" >> $GITHUB_STEP_SUMMARY
 
-          # Platinum Partners
-          echo "## Platinum" >> $GITHUB_STEP_SUMMARY
-          echo '<p align=left>' >> $GITHUB_STEP_SUMMARY
           TIMESTAMP=$(date +%s)
-          jq -r --arg ts "$TIMESTAMP" '.[] | select(.Partnership_Status == "Platinum Partner") |
-            "<a href='\''\(.Website)'\''><img src='\''\(.logo_url)?v=\($ts)'\'' width='\''150'\'' alt='\''\(.Account_Name)'\'' style=\"display:block; border:0; height:auto;\"></a>"' \
-            partners.json | tr '\n' ' ' >> $GITHUB_STEP_SUMMARY
-          echo '</p>' >> $GITHUB_STEP_SUMMARY
-          echo "" >> $GITHUB_STEP_SUMMARY
-
-          # Gold Partners
-          echo "## Gold" >> $GITHUB_STEP_SUMMARY
-          echo '<p align=left>' >> $GITHUB_STEP_SUMMARY
-          jq -r --arg ts "$TIMESTAMP" '.[] | select(.Partnership_Status == "Gold Partner") |
-            "<a href='\''\(.Website)'\''><img src='\''\(.logo_url)?v=\($ts)'\'' width='\''105'\'' alt='\''\(.Account_Name)'\'' style=\"display:block; border:0; height:auto;\"></a>"' \
-            partners.json | tr '\n' ' ' >> $GITHUB_STEP_SUMMARY
-          echo '</p>' >> $GITHUB_STEP_SUMMARY
-          echo "" >> $GITHUB_STEP_SUMMARY
+          for level in Platinum Gold Silver; do
+            case $level in
+              Platinum) width=150 ;;
+              Gold) width=105 ;;
+              Silver) width=75 ;;
+            esac
+            echo "## ${level}" >> $GITHUB_STEP_SUMMARY
+            echo '<p align=left>' >> $GITHUB_STEP_SUMMARY
+            jq -r --arg ts "$TIMESTAMP" --arg status "${level} Partner" --arg w "$width" \
+              '.[] | select(.Partnership_Status == $status) |
+              "<a href='\''\(.Website)'\''><img src='\''\(.logo_url)?v=\($ts)'\'' width='\''\($w)'\'' alt='\''\(.Account_Name)'\'' style=\"display:block; border:0; height:auto;\"></a>"' \
+              partners.json | tr '\n' ' ' >> $GITHUB_STEP_SUMMARY
+            echo '</p>' >> $GITHUB_STEP_SUMMARY
+            echo "" >> $GITHUB_STEP_SUMMARY
+          done
 
-          # Silver Partners
-          echo "## Silver" >> $GITHUB_STEP_SUMMARY
-          echo '<p align=left>' >> $GITHUB_STEP_SUMMARY
-          jq -r --arg ts "$TIMESTAMP" '.[] | select(.Partnership_Status == "Silver Partner") |
-            "<a href='\''\(.Website)'\''><img src='\''\(.logo_url)?v=\($ts)'\'' width='\''75'\'' alt='\''\(.Account_Name)'\'' style=\"display:block; border:0; height:auto;\"></a>"' \
-            partners.json | tr '\n' ' ' >> $GITHUB_STEP_SUMMARY
-          echo '</p>' >> $GITHUB_STEP_SUMMARY
-          echo "" >> $GITHUB_STEP_SUMMARY
+          # Install Python dependencies for days_since_last_commit script
+          pip install --quiet requests
 
           # Create a temporary file for maintainers_with_avatars.json
           temp_file=$(mktemp)
@@ -104,24 +116,59 @@ jobs:
           # Flag for commas between records
           first=1
 
-          # Fetch the maintainers from Zoho Bigin
-          curl --silent --request GET \
-            --url 'https://www.zohoapis.eu/bigin/v2/Contacts/search?fields=Team,First_Name,Github,Maintaining,Your_core_competences&criteria=((Tag:equals:maintainer)and(Inactive:equals:false))' \
-            --header "Authorization: Zoho-oauthtoken $ACCESS_TOKEN" \
-            | jq -c '.data[] | {First_Name, Github, Team, Maintaining, Your_core_competences}' \
+          # Fetch the maintainers from Zoho Bigin (fail fast on error, same as partners fetch)
+          if ! maintainers_response=$(curl --silent --fail --request GET \
+            --url 'https://www.zohoapis.eu/bigin/v2/Contacts/search?fields=Team,First_Name,Last_Name,Github,Maintaining,Your_core_competences&criteria=((Tag:equals:maintainer)and(Inactive:equals:false))' \
+            --header "Authorization: Zoho-oauthtoken $ACCESS_TOKEN"); then
+            echo "::error::Failed to fetch maintainers from Zoho"
+            exit 1
+          fi
+
+          echo "$maintainers_response" \
+            | jq -c '.data[] | {First_Name, Last_Name, Github, Team, Maintaining, Your_core_competences}' \
             | while read -r row; do
               # Extract GitHub username from the URL
               github_url=$(echo "$row" | jq -r '.Github')
               username=$(basename "$github_url")
+              first_name=$(echo "$row" | jq -r '.First_Name // empty')
+              last_name=$(echo "$row" | jq -r '.Last_Name // empty')
 
-              # Assume GH_TOKEN is exported as env var or injected from GitHub Actions secrets
-              auth_header="Authorization: token $GH_TOKEN"
+              # Skip entries without a valid GitHub username
+              if [ -z "$username" ] || [ "$username" = "null" ] || [ "$username" = "." ]; then
+                echo "::warning::Skipping maintainer '${first_name} ${last_name}' - no GitHub profile set"
+                continue
+              fi
 
               # Fetch GitHub profile for avatar URL
-              avatar_url=$(curl -s -H "$auth_header" "https://api.github.com/users/$username" | jq -r '.avatar_url')
+              echo "Processing maintainer: $username (${first_name} ${last_name})"
+              avatar_url=$(curl -s -H "Authorization: token $GH_TOKEN" "https://api.github.com/users/$username" | jq -r '.avatar_url' || true)
 
-              # Enrich Zoho data with GitHub avatar
-              enriched=$(echo "$row" | jq --arg avatar "$avatar_url" '. + {Avatar: $avatar}')
+              # Fall back to empty string if avatar fetch failed
+              if [ "$avatar_url" = "null" ] || [ -z "$avatar_url" ]; then
+                echo "::warning::Could not fetch avatar for $username"
+                avatar_url=""
+              fi
+
+              # Get days since last activity in Armbian org (commits, issues)
+              echo "Fetching days since last activity for $username..."
+              days_since=$(python3 armbian.github.io/scripts/days_since_last_commit.py --days-only armbian "$username" "$GH_TOKEN" || true)
+              if [ "$days_since" = "ERROR" ] || [ -z "$days_since" ]; then
+                echo "::warning::Lookup failed for $username, skipping activity data"
+                days_since="-1"
+              elif [ "$days_since" = "-1" ]; then
+                echo " -> $username: no activity found in org"
+              else
+                echo " -> $username: ${days_since} days since last activity"
+              fi
+
+              # Enrich Zoho data with GitHub avatar, days since last activity, and Last_Name fallback
+              enriched=$(echo "$row" | jq --arg avatar "$avatar_url" --arg days "$days_since" '
+                . + {Avatar: $avatar, Days_since_last_activity: ($days | tonumber)}
+                | if (.First_Name == null or .First_Name == "") and (.Last_Name != null and .Last_Name != "")
+                  then .First_Name = .Last_Name
+                  else . end
+                | del(.Last_Name)
+              ')
 
               # Manage commas between JSON objects
               if [ $first -eq 1 ]; then
@@ -137,8 +184,16 @@ jobs:
           # Close the JSON array
           echo "]" >> "$temp_file"
 
-          # Format and save the final output to fixed.json
-          cat "$temp_file" | jq . > maintainers.json
+          # Format and save the final output
+          jq . "$temp_file" > maintainers.json
+
+          # Validate maintainers data
+          maintainer_count=$(jq 'length' maintainers.json)
+          if [ "$maintainer_count" -lt 1 ]; then
+            echo "::error::Maintainers JSON is empty, refusing to commit"
+            exit 1
+          fi
+          echo "Fetched $maintainer_count maintainers."
 
           # Clean up the temporary file
           rm "$temp_file"
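
For readers less familiar with jq, the enrichment filter in the workflow diff above can be approximated in Python. This is a sketch under assumptions, not part of the commit: `enrich` is a hypothetical name, and it assumes `days` is always an integer string such as "12" or "-1", which is what the workflow passes after its ERROR fallback.

```python
def enrich(row, avatar, days):
    """Approximate the jq filter: add fields, apply the Last_Name fallback, drop Last_Name."""
    out = dict(row)  # copy so the caller's record is not modified
    out["Avatar"] = avatar
    out["Days_since_last_activity"] = int(days)  # jq: ($days | tonumber)
    # Fall back to Last_Name only when First_Name is null or empty
    if not out.get("First_Name") and out.get("Last_Name"):
        out["First_Name"] = out["Last_Name"]
    out.pop("Last_Name", None)  # jq: del(.Last_Name)
    return out


record = {"First_Name": None, "Last_Name": "Doe", "Github": "https://github.com/example"}
print(enrich(record, "", "-1"))
```

Note that Last_Name never reaches the published JSON: it is used only as a substitute display name and then deleted.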

scripts/days_since_last_commit.py

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
+#!/usr/bin/env python3
+import sys
+import time
+import requests
+from datetime import datetime, timezone
+
+API = "https://api.github.com"
+
+
+class LookupError(Exception):
+    pass
+
+
+def search_with_retry(url, headers, params, retries=5):
+    for attempt in range(retries):
+        try:
+            r = requests.get(url, headers=headers, params=params, timeout=30)
+        except requests.exceptions.RequestException as e:
+            raise LookupError(f"Request failed: {e}")
+
+        if r.status_code == 403:
+            retry_after = int(r.headers.get("Retry-After", 10 * (2 ** attempt)))
+            print(f"Rate limited, retrying in {retry_after}s (attempt {attempt + 1}/{retries})...", file=sys.stderr)
+            time.sleep(retry_after)
+            continue
+
+        if r.status_code == 422:
+            return None
+
+        if not r.ok:
+            raise LookupError(f"HTTP {r.status_code}")
+
+        return r.json()
+
+    raise LookupError("Rate limit retries exhausted")
+
+
+def get_latest_commit_date(org, user, headers):
+    data = search_with_retry(
+        f"{API}/search/commits", headers,
+        {"q": f"author:{user} org:{org}", "sort": "author-date", "order": "desc", "per_page": 1},
+    )
+    if not data or data.get("total_count", 0) == 0:
+        return None
+    date_str = data["items"][0]["commit"]["author"]["date"]
+    return datetime.fromisoformat(date_str.replace("Z", "+00:00"))
+
+
+def get_latest_issue_date(org, user, headers):
+    data = search_with_retry(
+        f"{API}/search/issues", headers,
+        {"q": f"author:{user} org:{org}", "sort": "created", "order": "desc", "per_page": 1},
+    )
+    if not data or data.get("total_count", 0) == 0:
+        return None
+    date_str = data["items"][0]["created_at"]
+    return datetime.fromisoformat(date_str.replace("Z", "+00:00"))
+
+
+def days_since_last_activity(org, user, token):
+    headers = {
+        "Accept": "application/vnd.github+json",
+        "Authorization": f"Bearer {token}",
+        "User-Agent": "org-activity-check",
+    }
+
+    dates = []
+    for fetch in (get_latest_commit_date, get_latest_issue_date):
+        dt = fetch(org, user, headers)  # raises LookupError on failure
+        if dt:
+            dates.append(dt)
+        time.sleep(1)
+
+    if not dates:
+        return None
+
+    latest = max(dates)
+    return (datetime.now(timezone.utc) - latest).days
+
+
+def main():
+    days_only = "--days-only" in sys.argv
+    args = [a for a in sys.argv[1:] if a != "--days-only"]
+
+    if len(args) != 3:
+        print("Usage: script.py [--days-only] ORG USER TOKEN")
+        sys.exit(1)
+
+    org, user, token = args[0], args[1], args[2]
+
+    try:
+        delta_days = days_since_last_activity(org, user, token)
+    except LookupError as e:
+        if days_only:
+            print("ERROR")
+        else:
+            print(f"Lookup failed for {user}: {e}")
+        sys.exit(1)
+
+    if delta_days is None:
+        if days_only:
+            print(-1)
+        else:
+            print(f"No activity found for {user} in org {org}")
+        sys.exit(0)
+
+    if days_only:
+        print(delta_days)
+    else:
+        print(f"Days since last activity: {delta_days}")
+
+
+if __name__ == "__main__":
+    main()
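
The `--days-only` output contract (ERROR on lookup failure, -1 for no activity, otherwise a day count) is consumed by the workflow in shell. A rough Python equivalent of that caller-side logic might look like the sketch below. `interpret_days_output` and the status labels are hypothetical names introduced for illustration; the mapping itself mirrors the workflow's `if`/`elif`/`else` branches.

```python
def interpret_days_output(stdout):
    """Map the script's --days-only stdout to a (status, days) pair."""
    value = stdout.strip()
    # Empty output or the ERROR sentinel: the lookup failed, fall back to -1
    if value == "" or value == "ERROR":
        return ("lookup_failed", -1)
    # -1 is reserved for a successful lookup that found no activity
    if value == "-1":
        return ("no_activity", -1)
    return ("active", int(value))


for sample in ("ERROR\n", "-1\n", "42\n", ""):
    print(repr(sample), "->", interpret_days_output(sample))
```

Both failure and no-activity end up as -1 in the published JSON, so the distinction survives only in the job log warnings.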
