
HttpRoute creates upstream with server unix:/var/run/nginx/nginx-503-server.sock and throws 503 error #3139


Open
Fak3 opened this issue Feb 15, 2025 · 14 comments
Labels: bug, community
Milestone: v2.1.0

Comments

@Fak3

Fak3 commented Feb 15, 2025

HttpRoute creates upstream with server unix:/var/run/nginx/nginx-503-server.sock and throws 503 error

To Reproduce

1. Install nginx-gateway:

kubectl apply -f https://raw.githubusercontent.com/nginx/nginx-gateway-fabric/v1.5.1/deploy/crds.yaml
kubectl apply -f https://raw.githubusercontent.com/nginx/nginx-gateway-fabric/v1.5.1/deploy/nodeport/deploy.yaml
2. Create a NodePort Service:
apiVersion: v1
kind: Service
metadata:
  name: ufo-service
  namespace: ufo-ns
spec:
  type: NodePort
  selector:
    app: ufo-app
  ports:
    - protocol: TCP
      name: http
      port: 80
      targetPort: 8000  # inside container

      nodePort: 30007
3. And an HTTPRoute:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  namespace: ufo-ns
  name: ufo-httproute
spec:
  parentRefs:
  - name: ufo-gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /lol
    backendRefs:
    - name: ufo-service
      port: 80
4. Check that the service itself can be successfully accessed on that node port: curl http://vybory.live:30007 returns HTTP 200.

5. Attempt to access the service via nginx-gateway and notice it returns HTTP 503:

$ curl http://vybory.live/lol

<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx</center>
</body>
</html>

6. Log into the nginx container with
kubectl exec -it --namespace nginx-gateway deployments/nginx-gateway -c nginx -- /bin/sh
7. Retrieve the generated configuration with nginx -T and notice that the upstream definition for ufo-service points to server unix:/var/run/nginx/nginx-503-server.sock; instead of the actual service IP:
upstream ufo-ns_ufo-service_80 {
    random two least_conn;
    zone ufo-ns_ufo-service_80 512k;

    server unix:/var/run/nginx/nginx-503-server.sock;
}

upstream ufo-ns_grafana_3000 {
    random two least_conn;
    zone ufo-ns_grafana_3000 512k;

    server 10.20.15.122:3000;
}
8. Also notice that, at the same time, the grafana upstream correctly references its service IP and can be accessed via http://grafana.vybory.live.

See full nginx config: https://gist.github.com/Fak3/73abc3e2b0bdbe6c38f13c62dbc09531

Expected behavior
Expected: the upstream definition in the config should reference the service by IP instead of pointing to unix:/var/run/nginx/nginx-503-server.sock.

Your environment

@Fak3
Author

Fak3 commented Feb 15, 2025

For completeness, here is the grafana service:

apiVersion: v1
kind: Service
metadata:
  name: grafana
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana

      # Will be accessible at the node IP, which can be retrieved with
      # > kubectl get node -o wide
      nodePort: 30008
  selector:
    app: grafana
  sessionAffinity: None
  type: NodePort

and its HTTPRoute:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  namespace: ufo-ns
  name: grafana-httproute
spec:
  parentRefs:
  - name: ufo-gateway
  hostnames:
  - "grafana.vybory.live"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: grafana
      port: 3000
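
For context, the ufo-gateway referenced by both routes is not included in this report; a minimal sketch of what it could look like, assuming the default nginx GatewayClass and a single HTTP listener on port 80 that allows routes from its own namespace:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: ufo-gateway
  namespace: ufo-ns
spec:
  gatewayClassName: nginx      # GatewayClass installed by the NGINX Gateway Fabric manifests
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: Same           # ufo-httproute and grafana-httproute both live in ufo-ns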

@sjberman
Collaborator

Hi @Fak3, the 503 server is used when we can't find any endpoints for the Service. Can you confirm that you see endpoints for the ufo-app?

kubectl -n ufo-ns get endpoints

Could be an issue with the selector labels on the Service not matching the Deployment that it's fronting.
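
One way to rule that out is to compare the Service's selector against the Pod labels, for example:

kubectl -n ufo-ns get svc ufo-service -o jsonpath='{.spec.selector}'
kubectl -n ufo-ns get pods --show-labels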

@mpstefan added the waiting for response label Feb 19, 2025
@Fak3
Author

Fak3 commented Feb 28, 2025

kubectl -n ufo-ns get endpoints

Endpoint is up and healthy:

kubectl -n ufo-ns get endpoints
NAME                  ENDPOINTS          AGE
grafana               10.244.0.16:3000   13d
node-exporter         10.244.0.26:9100   11d
postgres              10.244.0.24:5432   14d
prometheus-operated   10.244.0.9:9090    14d
ufo-service           10.244.0.23:8000   14d

@sjberman removed the waiting for response label Feb 28, 2025
@sjberman
Collaborator

Hm, so something is happening where the control plane is not seeing the endpoints for that service. I'd be curious if there is anything in those logs that would indicate why it's not seeing them.

Are you able to check (or provide) debug logs for the nginx-gateway container? To update to debug level, see https://docs.nginx.com/nginx-gateway-fabric/how-to/control-plane-configuration/.
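
For reference, the log level is set on the NginxGateway resource; a minimal sketch, assuming it is named nginx-gateway-config in the nginx-gateway namespace as in the default manifests:

apiVersion: gateway.nginx.org/v1alpha1
kind: NginxGateway
metadata:
  name: nginx-gateway-config
  namespace: nginx-gateway
spec:
  logging:
    level: debug   # accepted values: info, debug, error

Apply it with kubectl apply -f, or edit it in place with kubectl -n nginx-gateway edit nginxgateway nginx-gateway-config.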

@sjberman
Collaborator

sjberman commented Feb 28, 2025

Or even better, check the status of the ufo-httproute. kubectl -n ufo-ns describe httproute ufo-httproute.

@Fak3
Author

Fak3 commented Feb 28, 2025

kubectl -n ufo-ns describe httproute ufo-httproute

Name:         ufo-httproute
Namespace:    ufo-ns
Labels:       app.kubernetes.io/instance=dev-kind-ufo
Annotations:  <none>
API Version:  gateway.networking.k8s.io/v1
Kind:         HTTPRoute
Metadata:
  Creation Timestamp:  2025-02-14T19:31:45Z
  Generation:          2
  Resource Version:    1082085
  UID:                 a90a65c9-d342-4b2d-b787-c62347f5e110
Spec:
  Hostnames:
    ufo.kind
  Parent Refs:
    Group:  gateway.networking.k8s.io
    Kind:   Gateway
    Name:   ufo-gateway
  Rules:
    Backend Refs:
      Group:   
      Kind:    Service
      Name:    ufo-service
      Port:    80
      Weight:  1
    Matches:
      Path:
        Type:   PathPrefix
        Value:  /
Status:
  Parents:
    Conditions:
      Last Transition Time:  2025-02-28T15:26:30Z
      Message:               The route is accepted
      Observed Generation:   2
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2025-02-28T15:26:30Z
      Message:               All references are resolved
      Observed Generation:   2
      Reason:                ResolvedRefs
      Status:                True
      Type:                  ResolvedRefs
    Controller Name:         gateway.nginx.org/nginx-gateway-controller
    Parent Ref:
      Group:      gateway.networking.k8s.io
      Kind:       Gateway
      Name:       ufo-gateway
      Namespace:  ufo-ns
Events:           <none>

@Fak3
Author

Fak3 commented Feb 28, 2025

Hm, so something is happening where the control plane is not seeing the endpoints for that service. I'd be curious if there is anything in those logs that would indicate why it's not seeing them.

Are you able to check (or provide) debug logs for the nginx-gateway container? To update to debug level, see https://docs.nginx.com/nginx-gateway-fabric/how-to/control-plane-configuration/.

https://gist.github.com/Fak3/e9f2bc418cfe3cba7966fae43ba31780

@Fak3
Author

Fak3 commented Feb 28, 2025

Hm, so something is happening where the control plane is not seeing the endpoints for that service. I'd be curious if there is anything in those logs that would indicate why it's not seeing them.
Are you able to check (or provide) debug logs for the nginx-gateway container? To update to debug level, see https://docs.nginx.com/nginx-gateway-fabric/how-to/control-plane-configuration/.

https://gist.github.com/Fak3/e9f2bc418cfe3cba7966fae43ba31780

I've just updated this gist with the full debug log after I changed the log level to debug and did a rollout restart: kubectl -n nginx-gateway rollout restart deployment nginx-gateway

@Fak3
Author

Fak3 commented Feb 28, 2025

Hm, so something is happening where the control plane is not seeing the endpoints for that service. I'd be curious if there is anything in those logs that would indicate why it's not seeing them.
Are you able to check (or provide) debug logs for the nginx-gateway container? To update to debug level, see https://docs.nginx.com/nginx-gateway-fabric/how-to/control-plane-configuration/.

https://gist.github.com/Fak3/e9f2bc418cfe3cba7966fae43ba31780

I've just updated this gist with the full debug log after I changed the log level to debug and did a rollout restart: kubectl -n nginx-gateway rollout restart deployment nginx-gateway

And now, after the rollout restart, the HTTPRoute magically works and my service is available. So the issue is that it does not work until you restart nginx-gateway.
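
If it reappears, one thing worth capturing before restarting the control plane is whether an EndpointSlice exists for the Service (NGINX Gateway Fabric resolves backends from EndpointSlices rather than the legacy Endpoints object) and what the generated upstream looks like at that moment, e.g.:

kubectl -n ufo-ns get endpointslices -l kubernetes.io/service-name=ufo-service
kubectl -n nginx-gateway exec deployments/nginx-gateway -c nginx -- nginx -T | grep -A 4 'upstream ufo-ns_ufo-service_80'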

@sjberman
Collaborator

@Fak3 Definitely odd...I wonder if the endpoints weren't available at first and then were updated, and the control plane didn't see that update for some reason. That's never been an issue before, but I'm not really sure what happened in this case. Glad it's working for you now though!

@sjberman
Collaborator

sjberman commented Mar 4, 2025

Closing for now, if this reappears consistently we can take another look.

@sjberman closed this as completed Mar 4, 2025
@github-project-automation bot moved this from 🆕 New to ✅ Done in NGINX Gateway Fabric Mar 4, 2025
@Fak3
Author

Fak3 commented Mar 4, 2025

I hit it twice already: first on a production cluster, when I reported the bug, and a second time on my local kind cluster, 5 days ago. So I can reproduce it consistently.

@sjberman reopened this Mar 4, 2025
@github-project-automation bot moved this from ✅ Done to 🆕 New in NGINX Gateway Fabric Mar 4, 2025
@nginx-bot added the community label Mar 4, 2025
@nginx deleted a comment from nginx-bot Mar 4, 2025
@sjberman
Collaborator

sjberman commented Mar 4, 2025

Hm, okay. We'll have to dig a little deeper on this. I don't see any issues in the controller logs, and it is seeing the ufo-service endpoints (though not the Service object). Not sure if that means anything, just a quick observation.
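
If anyone hits this again with debug logging enabled, grepping the control-plane container's log for the Service name should show whether the relevant watch events ever arrived (assuming the default container name nginx-gateway from the standard deployment), e.g.:

kubectl -n nginx-gateway logs deployments/nginx-gateway -c nginx-gateway | grep -i ufo-service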

@sjberman added the bug label Mar 14, 2025
@github-actions
Contributor

This issue is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions bot added the stale label Mar 29, 2025
@sjberman removed the stale label Mar 31, 2025
@mpstefan added this to the v2.1.0 milestone Apr 1, 2025