EfficientDet: Towards Scalable And Efficient Object Detection

2025 Author: Ian Gardner | [email protected]. Last modified: 2025-06-01 06:34

As one of the core applications in computer vision, object detection is becoming increasingly important in scenarios that demand high accuracy but have limited computational resources, such as robotics and driverless cars. Unfortunately, many current high-accuracy detectors do not fit within these constraints. More importantly, real-world object detection applications run on a variety of platforms, which often demand different resources.

A natural question, then, is how to design accurate and efficient object detectors that can also adapt to a wide range of resource constraints.

"EfficientDet: Scalable and Efficient Object Detection", accepted at CVPR 2020, introduces a new family of scalable and efficient object detectors. Building on previous work on scaling neural networks (EfficientNet), and incorporating a novel bi-directional feature network (BiFPN) and new scaling rules, EfficientDet achieves state-of-the-art accuracy while being up to 9 times smaller and using significantly less computation than prior state-of-the-art detectors. The following figure shows the overall network architecture of the models.

Optimizing Model Architecture

The idea behind EfficientDet stems from an effort to improve computational efficiency by systematically examining previous state-of-the-art detection models. In general, object detectors have three main components: a backbone that extracts features from the given image; a feature network that takes multiple levels of features from the backbone as input and outputs a list of fused features that represent salient characteristics of the image; and a final class/box network that uses the fused features to predict the class and location of each object.

After reviewing the design options for these components, we identified several key optimizations to improve performance and efficiency. Previous detectors mostly use ResNets, ResNeXt, or AmoebaNet as backbones, which are either less powerful or less efficient than EfficientNets. Simply adopting an EfficientNet backbone already yields much better efficiency. For example, starting from a RetinaNet baseline that uses a ResNet-50 backbone, our ablation study shows that simply replacing ResNet-50 with EfficientNet-B3 can improve accuracy by 3% while reducing computation by 20%.

Another optimization is to improve the efficiency of the feature network. While most previous detectors simply use the top-down feature pyramid network (FPN), we find that the top-down FPN is inherently limited by its one-way flow of information. Alternative FPNs such as PANet add an extra bottom-up pathway at the cost of additional computation.

Recent attempts to use neural architecture search (NAS) have discovered a more complex NAS-FPN architecture. However, while this network structure is effective, it is also irregular and highly optimized for a specific task, which makes it difficult to adapt to other tasks. To solve these problems, we propose a new bi-directional feature network, BiFPN, which builds on the multi-level feature fusion idea from FPN / PANet / NAS-FPN and allows information to flow both top-down and bottom-up, using regular and efficient connections.
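The two directions of information flow can be sketched in a few lines. This is only an illustration of the wiring, under several assumptions that are not from the article: features are 1-D lists rather than image tensors, each pyramid level is half as long as the previous one, resizing uses nearest-neighbour upsampling and max pooling, and a plain element-wise mean stands in for the learned fusion and convolution. The function names are hypothetical.

```python
def upsample(x):
    """Nearest-neighbour upsampling by 2 for a 1-D feature."""
    return [v for v in x for _ in range(2)]


def downsample(x):
    """Max pooling by 2 for a 1-D feature (length assumed even)."""
    return [max(x[i], x[i + 1]) for i in range(0, len(x), 2)]


def fuse(*inputs):
    """Element-wise mean: a stand-in for learned weighted fusion plus convolution."""
    return [sum(vals) / len(vals) for vals in zip(*inputs)]


def bifpn_layer(inputs):
    """One bi-directional pass over a feature pyramid.

    inputs[0] is the finest (highest-resolution) level and inputs[-1] the
    coarsest; each level is assumed to be half as long as the one before it.
    """
    n = len(inputs)

    # Top-down pass: start at the coarsest level and push information
    # towards the finer levels.
    td = list(inputs)
    for i in range(n - 2, -1, -1):
        td[i] = fuse(inputs[i], upsample(td[i + 1]))

    # Bottom-up pass: start at the finest level and push information
    # back towards the coarser levels.
    out = list(td)
    for i in range(1, n):
        if i < n - 1:
            # Middle levels also reuse their original input.
            out[i] = fuse(inputs[i], td[i], downsample(out[i - 1]))
        else:
            # The coarsest level has no top-down intermediate of its own.
            out[i] = fuse(inputs[i], downsample(out[i - 1]))
    return out


if __name__ == "__main__":
    pyramid = [[1.0] * 8, [2.0] * 4, [3.0] * 2]
    for level in bifpn_layer(pyramid):
        print(level)
```

Because every output level sees both finer and coarser inputs, changing any one input level changes all output levels, which a purely top-down FPN cannot do for the coarsest level.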

To further improve efficiency, we propose a new fast normalized fusion technique. Traditional approaches usually treat all input features to the FPN equally, even when they have different resolutions. However, we observe that input features at different resolutions often contribute unequally to the output features. We therefore add an extra weight to each input feature and let the network learn the importance of each one. We also replace all regular convolutions with less expensive depthwise separable convolutions. With these optimizations, our BiFPN further improves accuracy by 4% while reducing computational cost by 50%.
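A minimal sketch of fast normalized fusion follows. It assumes the form described in the EfficientDet paper, where each raw weight is passed through a ReLU and divided by the sum of all weights plus a small epsilon; the article itself does not spell out the formula. Features are plain Python lists that are assumed to have already been resized to a common length, and the function name is hypothetical.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors using learned scalar weights.

    features: list of equal-length lists of floats (already at one resolution)
    weights:  one raw scalar weight per input feature
    eps:      small constant that avoids division by zero
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature")

    # ReLU keeps each weight non-negative, so the normalized weights
    # lie between 0 and 1 without needing a softmax.
    positive = [max(0.0, w) for w in weights]
    total = sum(positive) + eps
    normalized = [w / total for w in positive]

    # Weighted sum, element by element.
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(normalized, features))
        for i in range(length)
    ]


if __name__ == "__main__":
    high_res = [1.0, 1.0, 1.0, 1.0]
    low_res = [3.0, 3.0, 3.0, 3.0]
    # The second input is weighted three times as heavily as the first.
    print(fast_normalized_fusion([high_res, low_res], weights=[1.0, 3.0]))
```

In a real network the weights would be trainable parameters updated by backpropagation; here they are fixed numbers so the effect of the normalization is easy to inspect.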

The third optimization involves achieving a better trade-off between accuracy and efficiency under different resource constraints. Our previous work has shown that jointly scaling the depth, width, and resolution of a network can significantly improve efficiency for image recognition. Inspired by this idea, we propose a new compound scaling method for object detectors that jointly scales up the resolution, depth, and width. Each network component, i.e., the backbone, the feature network, and the box/class prediction network, shares a single compound scaling factor that controls all scaling dimensions using heuristic rules. This approach makes it easy to determine how to scale the model: one computes the scaling factor for the given target resource constraint.

By combining the new backbone and BiFPN, we first develop a small EfficientDet-D0 baseline and then apply compound scaling to obtain EfficientDet-D1 through D7. Each consecutive model has a higher computational cost, covering a wide range of resource constraints from 3 billion FLOPs to 300 billion FLOPs, and provides higher accuracy.
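As an illustration of how a single factor can drive every scaling dimension, the sketch below uses the heuristic rules as recalled from the EfficientDet paper; they are not given in this article and should be checked against the paper before being relied on. The released models round the channel counts and treat the largest variants specially, so the outputs here are illustrative rather than exact configurations. The function and key names are hypothetical.

```python
def efficientdet_scaling(phi):
    """Return illustrative detector settings for compound coefficient phi.

    Assumed heuristics (recalled from the paper, not stated in this article):
      BiFPN width      = 64 * 1.35 ** phi
      BiFPN depth      = 3 + phi
      box/class depth  = 3 + floor(phi / 3)
      input resolution = 512 + 128 * phi
    """
    if phi < 0:
        raise ValueError("phi must be non-negative")
    return {
        "bifpn_width": 64 * 1.35 ** phi,       # channels, before any rounding
        "bifpn_depth": 3 + phi,                # number of BiFPN layers
        "head_depth": 3 + phi // 3,            # layers in the box/class networks
        "input_resolution": 512 + 128 * phi,   # pixels per side
    }


if __name__ == "__main__":
    for phi in range(4):
        print(phi, efficientdet_scaling(phi))
```

The point of the design is that a practitioner picks one number for a given resource budget instead of tuning width, depth, and resolution separately for each component.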

Model Performance

We evaluate EfficientDet on the COCO dataset, a widely used benchmark for object detection. EfficientDet-D7 achieves a mean average precision (mAP) of 52.2, which is 1.5 points higher than the previous state-of-the-art model, while using 4 times fewer parameters and 9.4 times less computation.

We also compared parameter size and CPU / GPU latency between EfficientDet and previous models. Under similar accuracy constraints, EfficientDet models run 2-4 times faster on GPU and 5-11 times faster on CPU than other detectors.

While EfficientDet models are primarily designed for object detection, we also test their effectiveness on other tasks such as semantic segmentation. To perform segmentation, we slightly modify EfficientDet-D4 by replacing the detection head and loss function with a segmentation head and loss, while keeping the same scaled backbone and BiFPN. We compare this model with previous state-of-the-art segmentation models on Pascal VOC 2012, a widely used segmentation benchmark.

Given their exceptional performance, we expect EfficientDet models to serve as a new foundation for future object detection research and to potentially make high-accuracy object detection models practical in many real-world applications. We have therefore open-sourced all the code and pretrained model checkpoints on Github.com.
