Leveraging PHP Fibers for Concurrent Web Scraping: A Real-World Example
Concurrency has long been awkward in PHP because the language executes code synchronously: each call blocks until it returns. Fibers, introduced in PHP 8.1, give developers a built-in primitive for cooperative multitasking. A Fiber is a function with its own call stack that can be paused and resumed, which lets you write asynchronous code in a straightforward, readable style. Fibers do not make blocking calls non-blocking by themselves; the gains come from pairing them with non-blocking I/O.
In this article, we'll look at what PHP Fibers are and how they work, then walk through a real-world example of using them to improve the performance and responsiveness of a PHP application: concurrent web scraping.
What Are PHP Fibers?
Fibers in PHP are a lightweight mechanism for managing concurrency. They let you pause and resume code execution at specific points, which is particularly useful when you want to keep several tasks in progress at once without one waiting task holding up the others.
Key Concepts:
- Concurrency: Making progress on multiple tasks during overlapping periods of time. Fibers achieve this by interleaving tasks on a single thread, not by running them in parallel.
- Blocking vs. Non-Blocking: Blocking code waits for a task to complete before proceeding, while non-blocking code can continue running other tasks in the meantime.
How Do Fibers Work in PHP?
Fibers work by creating execution contexts that can be paused and resumed. Unlike traditional multi-threading or multi-processing, Fibers do not run in parallel but rather allow you to switch between different execution points within the same thread.
Here’s a simple example to illustrate how Fibers work:
$fiber = new Fiber(function () {
    echo "Fiber started\n";
    Fiber::suspend();
    echo "Fiber resumed\n";
});

echo "Starting fiber\n";
$fiber->start();
echo "Main code continues\n";
$fiber->resume();
echo "Main code ends\n";
Explanation:
- Fiber::suspend(): Pauses the execution of the fiber.
- $fiber->resume(): Resumes the execution from where it was suspended.
Output:
Starting fiber
Fiber started
Main code continues
Fiber resumed
Main code ends
Real-World Example: Concurrent Web Scraping with PHP Fibers
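Before turning to the scraper, one detail the basic example leaves out is that a Fiber can exchange values with the code driving it, which is what a scheduler typically relies on. The sketch below is an illustration only: the URL and the "data for ..." string are made up, and no HTTP request is performed. It shows that `start()` returns the value passed to `Fiber::suspend()`, that the argument of `resume()` becomes the return value of `Fiber::suspend()`, and that `getReturn()` yields the Fiber's final result.

```php
<?php
$fiber = new Fiber(function (string $url): string {
    // Hand the URL to the caller and pause. The value later passed to
    // resume() becomes the return value of Fiber::suspend().
    $body = Fiber::suspend($url);

    return strtoupper($body);
});

// start() passes its arguments to the Fiber function and returns
// whatever the Fiber handed to Fiber::suspend().
$requested = $fiber->start('https://example.com');

// Pretend the caller fetched something (hypothetical payload) and
// feed it back into the Fiber, which then runs to completion.
$fiber->resume("data for $requested");

$result = $fiber->getReturn();
echo $result, "\n";
```

In a scraper, the same hand-off lets each Fiber announce which request it is waiting on and receive the response once the scheduler has it.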
To illustrate the practical use of Fibers, let's consider a scenario where you need to scrape data from multiple websites. Traditionally, you would perform these HTTP requests sequentially, so the total time is the sum of every request's latency. With PHP Fibers and non-blocking I/O, you can keep several requests in flight at once, so the total time approaches that of the slowest request.
Scenario:
You are building a web scraper that needs to collect data from several websites. Instead of waiting for each HTTP request to complete before starting the next one, you can use Fibers to handle multiple requests concurrently.
Implementation:
class WebScraper
{
    private array $urls;

    public function __construct(array $urls)
    {
        $this->urls = $urls;
    }

    public function fetchAll(): array
    {
        $fibers = [];
        $responses = [];

        // Create a Fiber for each URL request
        foreach ($this->urls as $url) {
            $fibers[] = new Fiber(function () use ($url, &$responses) {
                echo "Fetching data from: $url\n";
                $responses[$url] = $this->fetch($url);
            });
        }

        // Start all fibers; each runs until its first Fiber::suspend() inside fetch()
        foreach ($fibers as $fiber) {
            $fiber->start();
        }

        // Scheduler loop: keep resuming suspended fibers until all have finished
        while ($fibers) {
            foreach ($fibers as $key => $fiber) {
                if ($fiber->isTerminated()) {
                    unset($fibers[$key]);
                } elseif ($fiber->isSuspended()) {
                    $fiber->resume();
                }
            }
            usleep(10000); // Avoid a busy loop while the simulated requests are pending
        }

        return $responses;
    }

    private function fetch(string $url): string
    {
        // Simulate a non-blocking 2-second HTTP request. sleep() would block every
        // fiber, so yield to the scheduler until the deadline has passed instead.
        // Replace with real non-blocking I/O, e.g. curl_multi or non-blocking streams.
        $deadline = microtime(true) + 2;
        while (microtime(true) < $deadline) {
            Fiber::suspend();
        }

        return "Data from $url";
    }
}

// Example usage
$urls = [
    'https://example.com',
    'https://another-example.com',
    'https://yet-another-example.com',
];
$scraper = new WebScraper($urls);
$responses = $scraper->fetchAll();
print_r($responses);
Explanation:
- Fiber Creation: For each URL, we create a new Fiber that handles the HTTP request.
- Concurrent Execution: Each Fiber starts its request and then suspends itself with Fiber::suspend() inside fetch() while the simulated response is pending, so the next Fiber can start right away.
- Resuming Execution: Once all requests are initiated, a scheduler loop resumes the suspended Fibers until each one has stored its response.
Output:
Fetching data from: https://example.com
Fetching data from: https://another-example.com
Fetching data from: https://yet-another-example.com
Array
(
[https://example.com] => Data from https://example.com
[https://another-example.com] => Data from https://another-example.com
[https://yet-another-example.com] => Data from https://yet-another-example.com
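)
When the fetch step is non-blocking, the waits for the three URLs overlap, so the total time is close to that of the slowest request (about 2 seconds here) rather than the sum of all requests (about 6 seconds). A blocking call such as sleep() or file_get_contents() inside a Fiber still stalls the whole script, so the gain depends on non-blocking I/O.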
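The implementation above only simulates the HTTP request. For real traffic, one possible approach (not the only one) is to pair Fibers with PHP's `curl_multi_*` functions, which perform several transfers at once. The following is a minimal sketch under stated assumptions: the function name `fetchAllConcurrently` is hypothetical, the cURL extension is assumed to be installed, and error handling is reduced to a timeout.

```php
<?php
// Hypothetical helper: one Fiber per URL, all transfers driven by curl_multi.
function fetchAllConcurrently(array $urls): array
{
    $multi = curl_multi_init();
    $fibers = [];
    $responses = [];

    foreach ($urls as $url) {
        $fiber = new Fiber(function () use ($url, $multi, &$responses): void {
            $handle = curl_init($url);
            curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($handle, CURLOPT_TIMEOUT, 10);
            curl_multi_add_handle($multi, $handle);

            // Hand control back; the scheduler resumes this Fiber
            // once all queued transfers have finished.
            Fiber::suspend();

            $responses[$url] = curl_multi_getcontent($handle);
            curl_multi_remove_handle($multi, $handle);
        });
        $fiber->start(); // Runs until Fiber::suspend(), i.e. the request is queued
        $fibers[] = $fiber;
    }

    // Drive all queued transfers together instead of one after another.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // Wait for socket activity
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Let every Fiber collect its response.
    foreach ($fibers as $fiber) {
        $fiber->resume();
    }

    return $responses;
}
```

Calling `fetchAllConcurrently($urls)` with the `$urls` array from the example would return the response bodies keyed by URL. Here the Fibers mainly keep each request's setup and result handling in one place, while `curl_multi_exec()` does the actual overlapping of network waits.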
Advantages of Using PHP Fibers
Using Fibers in PHP offers several advantages, particularly in scenarios that involve I/O-bound tasks like web scraping:
- Improved Performance: For I/O-bound work backed by non-blocking I/O, handling multiple tasks concurrently overlaps their waiting time, which reduces the overall time needed to complete them. CPU-bound work sees no speedup, because Fibers run on a single thread.
- Simplified Asynchronous Code: Fibers allow you to write asynchronous code in a more straightforward and readable manner compared to traditional callback-based approaches.
- Flexibility: Fibers can be used in various scenarios, including asynchronous I/O operations, cooperative multitasking, and simplifying complex logic.
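The performance point in the list above holds only when tasks yield while they wait. The timing sketch below is an illustration under assumptions: the function names `blockingTask`, `cooperativeTask`, and `runAll` are made up for this example, and the 50 ms delay stands in for I/O latency. It runs three Fibers twice, once with a blocking `usleep()` and once with a task that suspends itself until a deadline has passed.

```php
<?php
// Blocking task: usleep() stalls the whole script, even inside a Fiber.
function blockingTask(int $ms): void
{
    usleep($ms * 1000);
}

// Cooperative task: yields to the scheduler until its deadline has passed.
function cooperativeTask(int $ms): void
{
    $deadline = microtime(true) + $ms / 1000;
    while (microtime(true) < $deadline) {
        Fiber::suspend();
    }
}

// Runs $count copies of $task in Fibers and returns the elapsed seconds.
function runAll(callable $task, int $count, int $ms): float
{
    $start = microtime(true);
    $fibers = [];
    for ($i = 0; $i < $count; $i++) {
        $fiber = new Fiber($task);
        $fiber->start($ms);
        $fibers[] = $fiber;
    }

    // Round-robin: resume every unfinished Fiber until none are left.
    while ($fibers) {
        foreach ($fibers as $key => $fiber) {
            if ($fiber->isTerminated()) {
                unset($fibers[$key]);
            } else {
                $fiber->resume();
            }
        }
    }

    return microtime(true) - $start;
}

$blocking = runAll(blockingTask(...), 3, 50);       // waits add up
$cooperative = runAll(cooperativeTask(...), 3, 50); // waits overlap
printf("blocking: %.3fs, cooperative: %.3fs\n", $blocking, $cooperative);
```

The blocking run should take roughly three times the delay, while the cooperative run should take roughly one delay, because the three waits overlap. The scheduler here spins in a busy loop for simplicity; a real event loop would wait on sockets or timers instead.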
Conclusion
PHP Fibers provide a powerful way to handle concurrency, especially in scenarios like web scraping where multiple I/O-bound tasks can be kept in flight at once. Combined with non-blocking I/O, Fibers can significantly improve the efficiency and performance of your PHP applications.
If you’re looking to optimize your PHP code for better concurrency and responsiveness, experimenting with Fibers is a great place to start. As demonstrated in this article, Fibers can make complex tasks more manageable and lead to substantial performance gains.