
pg X.Y is stuck stale for , current state stale+active+clean, last acting [N]

I ran into these states after removing the last OSD that the crushmap assigned to a pool with size 1. I had no valuable data in the pool, but rather than remove it I tried reassigning it to a new root and new OSDs through its crushmap rule.

# ceph health detail
HEALTH_WARN 9 pgs stale; 9 pgs stuck stale
pg 18.6 is stuck stale for 8422.941233, current state stale+active+clean, last acting [13]
pg 18.1 is stuck stale for 8422.941247, current state stale+active+clean, last acting [13]
pg 19.0 is stuck stale for 8422.941251, current state stale+active+clean, last acting [13]
pg 18.0 is stuck stale for 8422.941252, current state stale+active+clean, last acting [13]
pg 19.1 is stuck stale for 8422.941255, current state stale+active+clean, last acting [13]
pg 18.3 is stuck stale for 8422.941254, current state stale+active+clean, last acting [13]
pg 19.2 is stuck stale for 8422.941258, current state stale+active+clean, last acting [13]
pg 18.2 is stuck stale for 8422.941259, current state stale+active+clean, last acting [13]
pg 19.3 is stuck stale for 8422.941263, current state stale+active+clean, last acting [13]
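
With many stale PGs it helps to pull the PG ids and the affected pool ids out of that output. The following is a minimal sketch that only parses text in the format shown above; the function names stale_pgs and stale_pools are made up for illustration and are not part of Ceph.

```shell
# Print the ids of PGs reported as stuck stale, one per line.
# Expects `ceph health detail` output on stdin.
stale_pgs() {
    awk '$1 == "pg" && $3 == "is" && $4 == "stuck" && $5 == "stale" { print $2 }'
}

# Print the distinct pool ids of those PGs (the part before the dot).
stale_pools() {
    stale_pgs | cut -d. -f1 | sort -un
}

# Usage on a live cluster:
#   ceph health detail | stale_pgs
#   ceph health detail | stale_pools
```

For the output above, stale_pools would list pools 18 and 19.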

The PGs report OSD 13 as their last acting OSD. That is the OSD I removed, and indeed it no longer exists in the cluster.

If I try querying the pg:

# ceph pg 18.6 query
Error ENOENT: i don't have pgid 18.6

The data inside those PGs is not valid, so I tried recreating the PG:

# ceph pg force_create_pg 18.6
pg 18.6 now creating, ok

Remember that I reassigned the pool to a new root in the crushmap, so there are many OSDs available for the pool. Even so, the PG now stays stuck in the “creating” state forever:

pg 18.6 is stuck inactive since forever, current state creating, last acting []

I suspected that the problem was the pool's PG number: I thought the pool could not create more PGs because of it.
I tried increasing the pool's PG number, and the PGs were finally created correctly.
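
I did not record the exact commands, so what follows is only a sketch of how the PG number of a pool is usually raised. The pool name and PG count are placeholders, and the wrapper ceph_cmd and the function bump_pg_num are hypothetical helpers. By default they only print the commands; set DRY_RUN=0 to run them against a real cluster.

```shell
# Print the ceph command instead of running it unless DRY_RUN=0.
ceph_cmd() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "ceph $*"
    else
        ceph "$@"
    fi
}

# Raise pg_num and then pgp_num of a pool to the same value.
# $1 = pool name (placeholder), $2 = new PG count (placeholder)
bump_pg_num() {
    ceph_cmd osd pool get "$1" pg_num          # show the current value
    ceph_cmd osd pool set "$1" pg_num "$2"     # create the additional PGs
    ceph_cmd osd pool set "$1" pgp_num "$2"    # let placement use them
}

# Usage (prints the three commands):
#   bump_pg_num mypool 16
```

Setting pgp_num to match pg_num matters because the new PGs are not considered for placement until pgp_num follows.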

I followed these steps for documentation purposes, but if you don’t mind losing the data inside the pool, the best option is to remove the pool and create it again.
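
The remove-and-recreate route could look roughly like this. It is a sketch, not something I ran here: the pool name and PG count are placeholders, recreate_pool_plan is a hypothetical helper that only prints the commands, and newer Ceph releases may additionally require pool deletion to be allowed on the monitors.

```shell
# Print the commands that delete a pool and create it again.
# $1 = pool name (placeholder), $2 = PG count for the new pool (placeholder)
# WARNING: running the printed commands destroys all data in the pool.
recreate_pool_plan() {
    pool="$1"
    pg_num="$2"
    cat <<EOF
ceph osd pool delete $pool $pool --yes-i-really-really-mean-it
ceph osd pool create $pool $pg_num $pg_num
EOF
}

# Usage: review the plan first, then pipe it to a shell to execute it.
#   recreate_pool_plan mypool 8
#   recreate_pool_plan mypool 8 | sh
```

The new pool gets a new pool id and the default crush rule, so the rule pointing at the new root has to be assigned again afterwards.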
