How does P0 know that the data it wants is stored in the memory that belongs to P1, so that its first step is to send the request to P1?
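The directory can be implemented like a distributed hash table: each memory address maps to a specific directory in the system. The mapping is established statically, so P0 can work out which directory to ask from the address alone, without consulting any other node.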
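As a rough illustration of such a static mapping, here is a minimal sketch that interleaves cache lines across nodes. The line size, node count, and the name `home_node` are assumptions made for the example; a real system may use different bits of the address or a different interleaving scheme.

```python
LINE_SIZE = 64   # bytes per cache line (assumed for illustration)
NUM_NODES = 4    # number of processor/directory nodes (assumed)

def home_node(addr: int) -> int:
    """Return the node whose directory is responsible for this address.

    The mapping depends only on the address, so every processor
    computes the same answer without any communication.
    """
    line = addr // LINE_SIZE   # drop the offset within the cache line
    return line % NUM_NODES    # interleave consecutive lines across nodes
```

With these assumed parameters, address 0x1040 falls in line 65, so `home_node(0x1040)` evaluates to 1 and the request would go to the directory at P1.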
On this specific system, each directory is co-located with the memory it tracks. So processor 0 sends its request to the memory bank attached to processor 1. The directory at that memory then determines that the request should be forwarded to processor 2 rather than serviced from memory, presumably because processor 2 holds the most recent copy of the line.
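The forwarding decision at the home directory could look roughly like the sketch below. The state names, the `DirEntry` record, and `handle_read` are all hypothetical, and the sketch assumes the request is forwarded because another cache holds the line in a modified state; the actual protocol in the system being discussed may differ.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DirEntry:
    # Hypothetical per-line directory record.
    state: str = "uncached"          # "uncached", "shared", or "modified"
    owner: Optional[int] = None      # node holding the modified copy, if any
    sharers: set = field(default_factory=set)

def handle_read(directory: dict, line: int, requester: int, home: int):
    """Decide how the home directory responds to a read request."""
    entry = directory.setdefault(line, DirEntry())
    if entry.state == "modified":
        # Memory is stale; the owner's cache must supply the data.
        return ("forward", entry.owner)
    # Memory is up to date; record the new sharer and reply directly.
    entry.state = "shared"
    entry.sharers.add(requester)
    return ("reply_from_memory", home)

# P0 reads a line whose home is P1 and whose modified copy lives at P2.
directory_p1 = {65: DirEntry(state="modified", owner=2)}
result = handle_read(directory_p1, 65, requester=0, home=1)
```

Here `result` is `("forward", 2)`: the directory at P1 passes the request on to P2 instead of answering from its own memory.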